Local limit theorems are derived for the number of occupied urns
in general finite and infinite urn models under the minimum condition 
that the variance tends to infinity. Our results represent an optimal 
improvement over previous ones for normal approximation. 



1. Introduction. A classical theorem of Rényi [29] for the number of
empty boxes, denoted by $\mu_0(n,M)$, in a sequence of $n$ random allocations
of indistinguishable balls into $M$ boxes with equal probability $1/M$, can be
stated as follows: If the variance of $\mu_0(n,M)$ tends to infinity with $n$, then
$\mu_0(n,M)$ is asymptotically normally distributed. This result, seldom stated
in this form in the literature, was proved by Rényi [29] by dissecting the
range of $n$ and $M$ into three different ranges, in each of which a different
method of proof was employed. Local limit theorems were later studied by
Sevast'yanov and Chistyakov [30] in a rather limited range when both ratios
$n/M$ and $M/n$ remain bounded. Kolchin [22] gave a very detailed study
of different approximation theorems. For a fairly complete account of this
theory, see Kolchin, Sevast'yanov and Chistyakov [23]. Englund [9] later
derived an explicit Berry-Esseen bound.

Multinomial extensions of the problem were studied by many authors. In
this scheme, balls are successively thrown into $M$ boxes, the probability of
each ball falling into the $j$th box being $p_j = p_j(M)$, $\sum_{0\le j<M} p_j = 1$. Quine
and Robinson [27] showed that if $p_jM$ is bounded for $j = 0, 1, \dots, M-1$ and
if the variance of $\mu_0(n,M)$ tends to infinity with $n$ ($n$ is the number of
allocations), then the distribution of $\mu_0(n,M)$ is asymptotically normal. They
indeed derived a Berry-Esseen bound for the normal approximation of the
distribution. Their result remains the strongest of its kind in the literature.
Note that the condition $p_jM = O(1)$ for all $j$ is one of the essential conditions
needed for proving the asymptotic normality of $\mu_0(n,M)$ in many previous
papers on multinomial schemes (see Holst [15], Kolchin, Sevast'yanov and
Chistyakov [23]); it implies that the general multinomial scheme studied in
the literature is indeed not very far from the equiprobable one. The major
contribution of this paper is to show that this condition can be completely
removed. Moreover, under the minimum condition that $\operatorname{Var}(\mu_0(n,M)) \to \infty$,
$\mu_0(n,M)$ satisfies a local limit theorem of the form



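$$\sup_{-\infty<x<\infty}\left|\mathbb{P}\bigl(\mu_0(n,M) = \lfloor \mu(n) + x\sigma(n)\rfloor\bigr) - \frac{e^{-x^2/2}}{\sqrt{2\pi}\,\sigma(n)}\right| = O\bigl(\sigma(n)^{-2}\bigr) \tag{1.1}$$

for some normalizing constants $\mu(n)$ and $\sigma(n)$, with $\sigma(n) \sim \sqrt{\operatorname{Var}(\mu_0(n,M))}$.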
Our result is thus, up to the implied constant in the error term, optimum. 
Moderate and large deviations can also be treated by extending our method 
of proof, but technicalities will be more involved and the result will be of a 
less explicit nature; thus we content ourselves with results of the type (1.1).

While the above finite urn schemes have received extensive attention in
the literature due to their wide applicability in diverse fields (see Johnson
and Kotz [19] and Kotz and Balakrishnan [24]), the model in which $M = \infty$
with $p_j$ fixed was rarely discussed. Bahadur [2], and independently Darling
[7], seem to have been the first to investigate such an urn model. Karlin [20]
gave the first systematic study of some basic statistics on this model. His
results were then extended by Dutko [8] using the same approach. Dutko showed
that in a sequence of $n$ throws, the number of occupied boxes,
$Z_{n,M}$, is asymptotically normally distributed provided only that its variance
tends to infinity with $n$. (His result was stated in a slightly weaker form.)
Note that this result was already stated in the review by Kesten [21] of
Darling's [7] paper in AMS Math Reviews. For other interesting aspects of
$Z_{n,M}$, see the two recent papers [5, 11].

We will derive a local limit theorem for $Z_{n,M}$ of the form (1.1) [with
$\mu_0(n,M)$ replaced by $Z_{n,M}$] under the minimum condition that
$\operatorname{Var}(Z_{n,M}) \to \infty$, where $M$ is either finite or infinite, and $p_j$ either
may depend on $n$ and $M$ or not.

Note that the number of occupied boxes is equivalent to the number of 
distinct values assumed in a sequence of n i.i.d. (independent and identically 
distributed) integer-valued random variables. This quantity is an important 
measure in several problems such as the coupon collector's problem,
species-trapping models, birthday paradox, polynomial factorization, statistical
linguistics, memory allocation, statistical physics, hashing schemes, and so on.
For example, the number of occupied urns under the geometric distribution
occurred naturally in at least two different problems in the literature:
the depth (the distance of a randomly chosen node to the root) in a class
of data structures called Patricia tries (see Rais, Jacquet and Szpankowski
[28]) and the number of distinct summands in random integer compositions
(see Hitczenko and Louchard [13], Hwang and Yeh [16], Gnedin, Pitman and 
Yor [12]); see also Prodinger [26] and Janson [18]. 

Almost all previous approaches rely, explicitly or implicitly, on the widely
used Poissonization technique, which is roughly stated as follows. Let $\{a_j\}_j$
be a given sequence such that the Poisson generating function $f(\lambda) :=
e^{-\lambda}\sum_j a_j\lambda^j/j!$ is an entire function. Then the Poisson heuristic on which
the Poissonization procedure relies reads:

$$\text{if } f(\lambda) \text{ is smooth enough for large } \lambda, \text{ then } a_n \approx f(n) \quad (n \to \infty). \tag{1.2}$$

Such a heuristic, guided by the underlying normal approximation to the
Poisson distribution, can usually be justified by suitable real or complex
analysis. As is often the case, it is the verification of the smoothness (or
regularity) property of $f(\lambda)$ that is the hard part of the heuristic and for
which technical conditions are usually introduced. The heuristic appeared in
different guises in diverse contexts such as Borel summability and Tauberian 
theorems; it can at least be traced back to Ramanujan's Notebooks; see the 
book by Berndt [4], pages 57-66, for more details, Aldous [1] and the survey 
paper by Jacquet and Szpankowski [17] for thorough discussions. 

To obtain our local limit theorems, we apply instead the two-dimensional
saddle-point method, which is in essence the most straightforward one and
may be regarded as an extension of the Poisson heuristic; see also
Remark 3.2. The approach we use can be extended along several lines: moderate
and large deviations of $Z_{n,M}$, consideration of other statistics such as urns
with a given number of balls, weighted coverage, goodness-of-fit tests, etc.

This paper is organized as follows. We first state our main results on local 
limit theorems in the next section. In Section 3 the case of a Poisson number 
of balls is considered, and we introduce the Poisson generating function that 
is central to our proofs. Asymptotics of mean and variance are derived in 
Section 4. Sections 5-7 give the proofs of the main results. Discrete limit 
laws are derived in Section 8. We conclude this paper with some properties 
of infinite urn models. 

Notation. The generic symbols $C, C_1, C_2, \dots$ and $c_1, c_2, \dots$ will always
denote some positive absolute constants; they can be replaced by explicit
numerical values if desired, but we avoid this for simplicity of presentation.
Similarly, the implicit constants in the $O$- and $\asymp$-symbols are absolute
constants, where the symbol $A \asymp B$ means that $c \le A/B \le C$ for some constants
$c$ and $C$.

2. Results. Let $X_1, X_2, \dots, X_n$ be a sequence of i.i.d. random variables
with a discrete distribution $F$. Let $Z = Z_{n,F}$ denote the number of distinct
values assumed by $X_1, \dots, X_n$.

Let $J$ be the set (finite or infinite) of possible values of $X_i$, and let the
distribution $F$ be given by

$$\mathbb{P}(X_1 = j) = p_j \qquad (j \in J),$$

where $\sum_j p_j = 1$. Here and throughout this paper, sums of the form $\sum_j$ are
taken to be $\sum_{j\in J}$ unless otherwise specified; similarly $\prod_j = \prod_{j\in J}$.

Alternatively, $Z$ counts the number of occupied urns in an urn scheme
where $n$ balls are thrown independently and each ball has the same
probability $p_j$ of falling into urn $j$, $j \in J$. Note that we allow $p_j = 0$ for
some $j$, although such elements may be freely added to or deleted from $J$
without changing $Z$.

We now state our results. Proofs are given in Sections 4-7. 



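Theorem 2.1. If $\operatorname{Var}(Z_{n,F}) \ne 0$, then the local limit theorem

$$\sup_{-\infty<x<\infty}\left|\mathbb{P}(Z_{n,F} = m) - \frac{e^{-x^2/2}}{\sqrt{2\pi\operatorname{Var}(Z_{n,F})}}\right| < \frac{C}{\operatorname{Var}(Z_{n,F})} \tag{2.1}$$

holds uniformly for all $n \in \mathbb{N}$ and $F$, where $m = \lfloor \mathbb{E}(Z_{n,F}) + x\sqrt{\operatorname{Var}(Z_{n,F})}\rfloor$.

The trivial case $\operatorname{Var}(Z_{n,F}) = 0$ occurs if and only if $n = 0$, $n = 1$, or $F$ is
a one-point distribution; in these cases $Z = 0$, $1$ and $1$, respectively.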

Remark 2.1 (Discrete distributions vs. continuous distributions). The
assumption that the distribution $F$ is discrete is not necessary. If $F$ is
continuous, then $Z = n$ a.s., another trivial case with $\operatorname{Var}(Z) = 0$. If $F$ has
both a discrete and a continuous part, then Theorem 2.1 still holds. To see this,
assume that $F$ has a continuous part with total mass $p$, and let $F_M$ be a
discrete distribution that has the same atoms as $F$ together with $M$ new
atoms $j$, each with $p_j = p/M$. We can now apply Theorem 2.1 to $F_M$, and
it is easily seen that if we let $M \to \infty$ (with $n$ fixed), then all quantities
in (2.1) for $F_M$ converge to the corresponding quantities for $F$; thus (2.1)
holds for $F$ also.

Similarly, the result below holds for general distributions with minor
modifications in the formulas (2.2)-(2.5) for mean and variance. We omit the
details.

Remark 2.2 (Finite urns vs. infinite urns). By a suitable truncation, 
it suffices to prove the results for a finite set J . This has the technical 
advantage that we do not need to address the convergence of the sums 
and products involved, which is, however, relatively easily checked. Indeed, 
without loss of generality, we may assume that J is the set of nonnegative 
integers; then we replace $X_i$ by $X_i \wedge M$, and let $F_M$ be the corresponding
distribution (i.e., $F$ truncated at $M$). It follows that if the result holds for
each $F_M$, it also holds for $F$, by letting $M \to \infty$.

Exact formulas for $\mathbb{E}(Z_{n,F})$ and $\operatorname{Var}(Z_{n,F})$ are given in (4.1) and (4.2)
below. However, these formulas are rather complicated; thus we first derive
simpler approximations to these quantities.

We define, for $x \ge 0$ (and, more generally, for any complex $x$ with $\Re x \ge 0$),

$$\mu_F(x) := \sum_j \bigl(1 - e^{-p_j x}\bigr), \tag{2.2}$$

$$v_F(x) := \sum_j e^{-p_j x}\bigl(1 - e^{-p_j x}\bigr), \tag{2.3}$$

$$u_F(x) := \sum_j p_j x\, e^{-p_j x}, \tag{2.4}$$

$$\sigma_F^2(x) := v_F(x) - \frac{u_F(x)^2}{x} \tag{2.5}$$

(with $\sigma_F^2(0) := 0$) and, for later use,

$$\tilde v_F(x) := x + v_F(x) - 2u_F(x). \tag{2.6}$$

We will see in Section 3 that $\mu_F(x)$ and $v_F(x)$ are the mean and variance
of $Z$ if the fixed number $n$ of variables (balls) is replaced by a Poisson
number with mean $x > 0$, and that $u_F(x)$, $\sigma_F^2(x)$ and $\tilde v_F(x)$ also have simple
interpretations in terms of this Poissonized version. Noting that

$$\frac12\sum_{i,j\in J} p_i p_j\bigl(e^{-p_i x} - e^{-p_j x}\bigr)^2 = \sum_i p_i e^{-2p_i x} - \Bigl(\sum_i p_i e^{-p_i x}\Bigr)^2,$$

we obtain the following alternative formula.

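Proposition 2.2.

$$\sigma_F^2(x) = \sum_j e^{-p_j x}\bigl(1 - (1 + p_j x)e^{-p_j x}\bigr) + \frac{x}{2}\sum_{i,j} p_i p_j\bigl(e^{-p_i x} - e^{-p_j x}\bigr)^2. \tag{2.7}$$

All terms in the sums in (2.7) are nonnegative for $x \ge 0$. Hence, $\sigma_F^2(x) \ge 0$
for any $F$ and all $x \ge 0$.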
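
As a numerical sanity check of the definitions (2.2)-(2.6) and of Proposition 2.2, the following Python sketch evaluates both expressions for $\sigma_F^2(x)$ and compares them. The function names and the truncated geometric example distribution are illustrative assumptions, not taken from the paper; note that the identity relies on $\sum_j p_j = 1$, so the example weights are renormalized.

```python
import math

def urn_functions(p, x):
    """Return (mu, v, u, sigma2, v_tilde) as defined in (2.2)-(2.6), for x > 0."""
    mu = sum(1.0 - math.exp(-pj * x) for pj in p)
    v = sum(math.exp(-pj * x) * (1.0 - math.exp(-pj * x)) for pj in p)
    u = sum(pj * x * math.exp(-pj * x) for pj in p)
    sigma2 = v - u * u / x
    v_tilde = x + v - 2.0 * u
    return mu, v, u, sigma2, v_tilde

def sigma2_alt(p, x):
    """Right-hand side of (2.7): a sum of nonnegative terms."""
    first = sum(
        math.exp(-pj * x) * (1.0 - (1.0 + pj * x) * math.exp(-pj * x)) for pj in p
    )
    second = 0.5 * x * sum(
        pi * pj * (math.exp(-pi * x) - math.exp(-pj * x)) ** 2
        for pi in p
        for pj in p
    )
    return first + second

# Hypothetical example: geometric weights 2^{-j}, truncated at j = 30 and renormalized.
weights = [2.0 ** -j for j in range(1, 31)]
total = sum(weights)
p_geom = [w / total for w in weights]

for x in (0.5, 50.0, 1000.0):
    mu, v, u, sigma2, v_tilde = urn_functions(p_geom, x)
    print(x, sigma2, sigma2_alt(p_geom, x))
```

The two printed values should agree up to rounding error for each $x$.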

Theorem 2.3. The mean and the variance of $Z_{n,F}$ satisfy

$$\mathbb{E}(Z_{n,F}) = \mu_F(n) + O(1), \tag{2.8}$$

$$\operatorname{Var}(Z_{n,F}) = \sigma_F^2(n) + O(1). \tag{2.9}$$

The $O(1)$-terms in (2.8) and (2.9) are in some cases $o(1)$, as we will see
later.

We can thus replace the exact mean and the exact variance in Theorem 2.1 
by their asymptotic approximations. 



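Theorem 2.4. If $\sigma_F(n) \ne 0$, then uniformly for all $n \ge 1$ and $F$

$$\sup_{-\infty<x<\infty}\left|\mathbb{P}\bigl(Z_{n,F} = \lfloor \mu_F(n) + x\sigma_F(n)\rfloor\bigr) - \frac{e^{-x^2/2}}{\sqrt{2\pi}\,\sigma_F(n)}\right| < \frac{C}{\sigma_F^2(n)}.$$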
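
To make the statement of Theorem 2.4 concrete, the following Python sketch compares the exact distribution of the number of occupied urns with the Gaussian approximation in the equiprobable case $p_j = 1/M$. It is only an illustration under stated assumptions: the exact probabilities come from the elementary one-step recursion for equiprobable urns (not from the paper's method), the function names are hypothetical, and the Gaussian density is evaluated only at the left endpoint $x = (m - \mu_F(n))/\sigma_F(n)$ of each interval of $x$ that is mapped to $m$.

```python
import math

def occupancy_pmf_equiprobable(n, M):
    """Exact P(Z_n = m), m = 0..M, for M equally likely urns."""
    pmf = [1.0] + [0.0] * M            # before any ball is thrown, Z_0 = 0
    for _ in range(n):
        new = [0.0] * (M + 1)
        for m, prob in enumerate(pmf):
            if prob == 0.0:
                continue
            new[m] += prob * m / M                  # ball lands in an occupied urn
            if m < M:
                new[m + 1] += prob * (M - m) / M    # ball lands in an empty urn
        pmf = new
    return pmf

def local_limit_error(n, M):
    """Return (max_m |P(Z_n = m) - Gaussian density|, sigma_F^2(n))."""
    lam = n / M
    mu = M * (1.0 - math.exp(-lam))                                   # (2.2)
    sigma2 = (M * math.exp(-lam) * (1.0 - math.exp(-lam))
              - n * math.exp(-2.0 * lam))                             # (2.5)
    sigma = math.sqrt(sigma2)
    err = 0.0
    for m, prob in enumerate(occupancy_pmf_equiprobable(n, M)):
        x = (m - mu) / sigma
        gauss = math.exp(-x * x / 2.0) / (math.sqrt(2.0 * math.pi) * sigma)
        err = max(err, abs(prob - gauss))
    return err, sigma2

for n, M in ((100, 50), (400, 200), (1600, 800)):
    err, sigma2 = local_limit_error(n, M)
    print(n, M, err, err * sigma2)
```

If the theorem's bound is reflected in these examples, the last printed column should stay bounded as $n$ and $M$ grow with $n/M$ fixed.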



These results are stated as approximation results. If we consider a
sequence of such variables $Z_{n,F}$, by letting $n \to \infty$ and varying $F$, assuming
only that $\operatorname{Var}(Z_{n,F}) \to \infty$, Theorems 2.1 and 2.4 can be interpreted as local
central limit theorems. The corresponding central limit theorem, with (the
generally weaker) convergence in distribution, can be stated as follows.

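Corollary 2.5. Consider a sequence $(n_\nu, F_\nu)_\nu$ of integers $n_\nu$ and
distributions $F_\nu$. Then the following statements are equivalent, with $Z_\nu :=
Z_{n_\nu,F_\nu}$ and $\sigma_\nu^2 := \operatorname{Var}(Z_\nu)$:

(i) $\sigma_\nu^2 \to \infty$;
(ii) $\sigma_{F_\nu}^2(n_\nu) \to \infty$;
(iii) $(Z_\nu - \mathbb{E}(Z_\nu))/\sigma_\nu \xrightarrow{d} N(0,1)$;
(iv) $(Z_\nu - \mu_{F_\nu}(n_\nu))/\sigma_{F_\nu}(n_\nu) \xrightarrow{d} N(0,1)$;
(v) $(Z_\nu - \alpha_\nu)/\beta_\nu \xrightarrow{d} N(0,1)$ for some sequences $\alpha_\nu$ and $\beta_\nu$ with $\beta_\nu > 0$.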

These theorems cover many results in the literature as special cases. 
From now on, the distribution F will be fixed, and we will generally drop 
the subscript F from the notation. 

For our method of proof, we consider two (technical) cases:

(i) $\sum_{p_jn<1} p_j \ge 1/2$, meaning roughly that the asymptotics of $Z_n$ are
dominated by small $p_j$.

(ii) $\sum_{p_jn\ge1} p_j \ge 1/2$, meaning roughly that the asymptotics of $Z_n$ are
dominated by large $p_j$.

Obviously, at least one of these cases will hold. Here, the value 1/2 is not
essential and can be changed to any small positive constant (with consequent
changes in the values of some of the unspecified constants below); similarly,
the cut-off at $p_jn = 1$ is chosen for technical convenience. We will work
with $Z_n$ in case (ii) and with $\tilde Z_n := n - Z_n$ in case (i); it will turn out that
Poissonization then works well in both cases.
Remark 2.3 (Exact distribution). It is easy to find the exact
distribution of $Z_{n,F}$. Indeed, assuming as we may that the set $J$ is ordered,

$$\mathbb{P}(Z_{n,F} = m) = \sum_{j_1<\cdots<j_m}\ \sum_{\substack{k_1+\cdots+k_m=n\\ k_1,\dots,k_m\ge1}}\binom{n}{k_1,\dots,k_m}\,p_{j_1}^{k_1}\cdots p_{j_m}^{k_m}. \tag{2.10}$$

This expression explains why such urn schemes are called multinomial
allocations. However, it will not be used in this paper.

3. Poissonization. We consider first the mean and the variance of the
number of occupied urns or the number of distinct values when the number
of balls or variables has a Poisson distribution.

Recall that $Z_n$ is the number of occupied urns when we throw $n$ balls.
Then $\tilde Z_n := n - Z_n$ represents the number of balls that land in a nonempty
urn.

Consider now instead the case when the number $N$ of balls is Poisson
distributed. Let $Z(\lambda)$ denote the number of occupied urns with $N = N(\lambda) \sim
\operatorname{Po}(\lambda)$ balls; let $\tilde Z(\lambda) := N(\lambda) - Z(\lambda)$ be the number of balls that land in an
occupied urn.

Remark 3.1 (A coupling). We may define $Z_n$ and $Z(\lambda)$ for all $n \ge 0$ and
$\lambda \ge 0$ simultaneously (for a given $F$) by throwing balls at times given by a
Poisson process with intensity 1. We let $Z(\lambda)$ be the number of occupied urns
at time $\lambda \ge 0$, when $\operatorname{Po}(\lambda)$ balls have been thrown, and let $Z_n$ be the number
of occupied urns when the $n$th ball has been thrown. This defines the various
variables simultaneously, with both $Z_n$ and $\tilde Z_n := n - Z_n$ increasing in $n$ and
both $Z(\lambda)$ and $\tilde Z(\lambda) := N(\lambda) - Z(\lambda)$ increasing in $\lambda$, where $N(\lambda) \sim \operatorname{Po}(\lambda)$ is
the number of balls thrown by time $\lambda$.

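Let $U_j$ be the number of balls in urn $j$. Then $N = \sum_j U_j$,

$$Z = \sum_j \mathbf{1}_{\{U_j\ge1\}}, \tag{3.1}$$

where $\mathbf{1}_A$ denotes the indicator function of the event $A$, and

$$\tilde Z = \sum_j U_j - Z = \sum_j\bigl(U_j - \mathbf{1}_{\{U_j\ge1\}}\bigr) = \sum_j \tilde U_j,$$

where $\tilde U_j := U_j - \mathbf{1}_{\{U_j\ge1\}}$.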

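In the Poisson case, the random variables $U_j$ are independent and Poisson
distributed, with $U_j \sim \operatorname{Po}(p_j\lambda)$. By (3.1), $Z(\lambda) = \sum_j I_j$, where $I_j :=
\mathbf{1}_{\{U_j\ge1\}} \sim \operatorname{Be}(1 - e^{-p_j\lambda})$ are independent Bernoulli random variables. It
follows that

$$\mathbb{E}(Z(\lambda)) = \sum_j \mathbb{E}(I_j) = \sum_j\bigl(1 - e^{-p_j\lambda}\bigr) = \mu(\lambda),$$

$$\operatorname{Var}(Z(\lambda)) = \sum_j \operatorname{Var}(I_j) = \sum_j e^{-p_j\lambda}\bigl(1 - e^{-p_j\lambda}\bigr) = v(\lambda).$$

Similarly, $\tilde Z(\lambda) = \sum_j \tilde U_j$, where $\tilde U_j = U_j + \mathbf{1}_{\{U_j=0\}} - 1$ are independent.
We have

$$\mathbb{E}(\tilde U_j) = p_j\lambda + e^{-p_j\lambda} - 1,$$

$$\operatorname{Var}(\tilde U_j) = \operatorname{Var}(U_j) + \operatorname{Var}(\mathbf{1}_{\{U_j=0\}}) + 2\operatorname{Cov}(U_j, \mathbf{1}_{\{U_j=0\}}) = p_j\lambda + e^{-p_j\lambda}\bigl(1 - e^{-p_j\lambda}\bigr) - 2p_j\lambda e^{-p_j\lambda}.$$

Accordingly [see (2.2)-(2.6)],

$$\mathbb{E}(\tilde Z(\lambda)) = \sum_j\bigl(p_j\lambda - (1 - e^{-p_j\lambda})\bigr) = \lambda - \mu(\lambda),$$

$$\operatorname{Var}(\tilde Z(\lambda)) = \sum_j\bigl(p_j\lambda + e^{-p_j\lambda}(1 - e^{-p_j\lambda}) - 2p_j\lambda e^{-p_j\lambda}\bigr) = \lambda + v(\lambda) - 2u(\lambda) = \tilde v(\lambda). \tag{3.2}$$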

Remark 3.2 (A connection between the two cases $\sum_{p_jn<1} p_j \ge \frac12$ and
$\sum_{p_jn\ge1} p_j \ge \frac12$). We have seen that $\mu(\lambda)$ and $v(\lambda)$ are the mean and
variance of $Z$ in the Poisson case. Similarly, it is easy to see that
$\operatorname{Cov}(Z, N) = u(\lambda)$, and thus

$$\sigma^2(\lambda) = \operatorname{Var}(Z) - \frac{\operatorname{Cov}(Z,N)^2}{\operatorname{Var}(N)}, \tag{3.3}$$

which can be interpreted as the smallest variance of a linear combination
$Z - aN$ with $a \in \mathbb{R}$; that is, $\sigma^2(\lambda) = \operatorname{Var}(Z - a_0N)$, where $Z - a_0N$
is optimal in this sense, and $a_0$ is, on the other hand, also determined by
$\operatorname{Cov}(Z - a_0N, N) = 0$. (These are standard equations in linear regression,
where (3.3) is the residual variance, and $a_0 = \operatorname{Cov}(Z, N)/\operatorname{Var}(N) \in [0,1]$
because $\operatorname{Cov}(Z, N) = u(\lambda) \le \lambda = \operatorname{Var}(N)$.)

Our method of proof is based on analyzing the Poisson generating function
$P$ defined below. This can be regarded as an analytical Poissonization, and it
is, at least heuristically, strongly related to replacing $Z_n$ by the Poissonized
$Z(\lambda)$ and then compensating for the randomness in $N$, the number of balls,
in order to derive results for $Z_n$. It is then natural to consider the projection
$Z - a_0N$, which eliminates the first-order (linear) fluctuations in $Z$ due to
the randomness in $N$. Theorem 2.3 says that, with $\lambda = n$, this projection
has almost the same variance as $Z_n$, which indicates that this projection
(plus the constant $a_0n$) is a good approximation to $Z_n$. Moreover, as we
will see in Proposition 4.3 below, the smallest variance $\sigma^2(\lambda)$ of a linear
combination $Z - aN$ is attained within a constant factor by one of the choices
$a = 0$ and $a = 1$, which give $Z$ and $Z - N = -\tilde Z$, respectively. Indeed, the
arguments below can be interpreted as considering these two choices only. (It
is likely that one could use similar arguments corresponding to the optimal
projection $Z - a_0N$, without splitting our analysis below into two cases.)
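
The regression interpretation in Remark 3.2 can be illustrated numerically. In the Poisson model, $\operatorname{Var}(Z - aN) = v(\lambda) - 2a\,u(\lambda) + a^2\lambda$, which is minimized at $a_0 = u(\lambda)/\lambda$ with minimum value $\sigma^2(\lambda)$. The following Python sketch checks this on a grid of values of $a$; the function names and the equiprobable example are assumptions made for illustration only.

```python
import math

def poissonized_moments(p, lam):
    """Return (v, u) = (Var Z(lam), Cov(Z(lam), N(lam))) in the Poisson model."""
    v = sum(math.exp(-pj * lam) * (1.0 - math.exp(-pj * lam)) for pj in p)
    u = sum(pj * lam * math.exp(-pj * lam) for pj in p)
    return v, u

def var_linear_combination(p, lam, a):
    """Var(Z - a N) = Var Z - 2 a Cov(Z, N) + a^2 Var N, with Var N = lam."""
    v, u = poissonized_moments(p, lam)
    return v - 2.0 * a * u + a * a * lam

p_example = [1.0 / 20] * 20     # hypothetical example: 20 equiprobable urns
lam = 30.0
v, u = poissonized_moments(p_example, lam)
a0 = u / lam                    # optimal coefficient a_0 = Cov(Z, N) / Var(N)
sigma2 = v - u * u / lam        # residual variance, as in (3.3)

grid = [k / 100.0 for k in range(-100, 201)]
best = min(var_linear_combination(p_example, lam, a) for a in grid)
print(a0, sigma2, best)
```

The choices $a = 0$ and $a = 1$ return $v(\lambda)$ and $\tilde v(\lambda)$, respectively, which are the two quantities compared with $\sigma^2(\lambda)$ in Proposition 4.3.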

We define, for complex $z$ and $y$, $P(z,y)$ to be the exponential generating
function of $\mathbb{E}(y^{Z_n})$ given by

$$P(z,y) := \sum_{n\ge0}\frac{z^n}{n!}\,\mathbb{E}(y^{Z_n}) = \sum_{n,m\ge0}\frac{z^ny^m}{n!}\,\mathbb{P}(Z_n = m); \tag{3.4}$$

we further define the Poisson generating function $Q(z,y) := e^{-z}P(z,y)$. Note
that for $\lambda \ge 0$, $Q(\lambda,y) = \mathbb{E}(y^{Z(\lambda)})$ is the probability generating function of
$Z(\lambda)$. It follows immediately from (3.1) for $z \ge 0$, and for general complex
$z$ by analytic continuation, that

$$Q(z,y) = \prod_j\bigl(1 + (y-1)(1 - e^{-p_jz})\bigr),$$

and thus, using $\sum_j p_j = 1$,

$$P(z,y) = e^zQ(z,y) = \prod_j\bigl(1 + y(e^{p_jz} - 1)\bigr). \tag{3.5}$$

This also follows easily from (2.10); see also Karlin [20], Johnson and Kotz
[19], Kolchin, Sevast'yanov and Chistyakov [23], Flajolet, Gardy and
Thimonier [10] for different derivations.
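
The product formula (3.5) can be used directly to compute the exact distribution for a small finite urn model, since by (3.4) $\mathbb{P}(Z_n = m) = n!\,[z^ny^m]P(z,y)$. The Python sketch below multiplies out the factors as truncated power series in $z$ with polynomial coefficients in $y$ and compares the result with brute-force enumeration. The function names and the three-urn example are illustrative assumptions; this is a check of the formula, not the method used in the proofs.

```python
import math
from itertools import product

def occupancy_pmf(p, n):
    """P(Z_n = m), m = 0..len(p), as n! [z^n y^m] prod_j (1 + y (e^{p_j z} - 1))."""
    M = len(p)
    # coef[k][m] holds the coefficient of z^k y^m in the partial product.
    coef = [[0.0] * (M + 1) for _ in range(n + 1)]
    coef[0][0] = 1.0
    for pj in p:
        e = [pj ** k / math.factorial(k) for k in range(n + 1)]   # [z^k] e^{p_j z}
        new = [[0.0] * (M + 1) for _ in range(n + 1)]
        for k in range(n + 1):
            for m in range(M + 1):
                c = coef[k][m]
                if c == 0.0:
                    continue
                new[k][m] += c                      # term 1: urn j stays empty
                if m < M:
                    for r in range(1, n - k + 1):   # term y z^r p_j^r / r!: urn j gets r balls
                        new[k + r][m + 1] += c * e[r]
        coef = new
    return [math.factorial(n) * coef[n][m] for m in range(M + 1)]

def brute_force_pmf(p, n):
    """Enumerate all len(p)^n allocation sequences (only feasible for tiny cases)."""
    M = len(p)
    pmf = [0.0] * (M + 1)
    for seq in product(range(M), repeat=n):
        pmf[len(set(seq))] += math.prod(p[j] for j in seq)
    return pmf

p_small = [0.5, 0.3, 0.2]      # hypothetical three-urn example
print(occupancy_pmf(p_small, 5))
print(brute_force_pmf(p_small, 5))
```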

Note also that the probability generating function of $\tilde Z(\lambda) = N - Z(\lambda)$ is
given by

$$\mathbb{E}(y^{\tilde Z(\lambda)}) = \sum_{n\ge0}\frac{\lambda^n}{n!}\,e^{-\lambda}\,\mathbb{E}(y^{n-Z_n}) = e^{-\lambda}P(\lambda y, y^{-1}). \tag{3.6}$$

According to the Poisson heuristic (1.2), if $Q$ were smooth enough, then
we would have

$$\mathbb{E}(y^{Z_n}) \approx Q(n,y) \qquad (n \to \infty),$$

and the asymptotic normality of $Z_n$ would then follow from the Taylor expansion
of the cumulant generating function

$$\log\mathbb{E}(e^{Z_ns}) \approx s\sum_j\bigl(1 - e^{-p_jn}\bigr) + \frac{s^2}{2}\sum_j e^{-p_jn}\bigl(1 - e^{-p_jn}\bigr) + \cdots,$$

provided that the second sum tends to infinity and the error term becomes
small after normalization. However, the general situation here turns out to
be more complicated. First, the variance of $Z_n$ is not necessarily of the
same order as the second sum. Second, the growth order of $Q(z,y)$ is not
necessarily polynomial in $|z|$; for example, $Q(z,0) = e^{-z}$. Thus more refined
arguments are required to properly justify the (implicit) underlying Poisson
heuristic (1.2).

4. Mean and variance of Z. We prove in this section the estimates (2.8)
and (2.9) for the mean and variance of $Z_n$, and some related estimates.

Proof of Theorem 2.3. By straightforward calculations, (3.1) leads
to

$$\mathbb{E}(Z_n) = \sum_j\bigl(1 - (1-p_j)^n\bigr), \tag{4.1}$$

$$\operatorname{Var}(Z_n) = \sum_j(1-p_j)^n\bigl(1 - (1-p_j)^n\bigr) + \sum_{i\ne j}\bigl((1-p_i-p_j)^n - (1-p_i)^n(1-p_j)^n\bigr). \tag{4.2}$$

Now for $p \in [0,1]$, we have

$$0 \le e^{-pn} - (1-p)^n \le ne^{-p(n-1)}\bigl(e^{-p} - 1 + p\bigr) = O(p^2ne^{-pn}) = O(p), \tag{4.3}$$

and thus (2.8) follows from (4.1).

Similarly, by $(1-x)^n = 1 - nx + O(n^2x^2)$ for $0 \le x \le 1$, we have, for $n \ge 2$,

$$\begin{aligned}
(1-p_i)^n(1-p_j)^n - (1-p_i-p_j)^n
&= (1-p_i)^n(1-p_j)^n\left(1 - \Bigl(1 - \frac{p_ip_j}{(1-p_i)(1-p_j)}\Bigr)^n\right)\\
&= (1-p_i)^n(1-p_j)^n\left(\frac{np_ip_j}{(1-p_i)(1-p_j)} + O\Bigl(\frac{n^2p_i^2p_j^2}{(1-p_i)^2(1-p_j)^2}\Bigr)\right)\\
&= p_ip_jn(1-p_i)^{n-1}(1-p_j)^{n-1} + O\bigl((p_ip_jn)^2(1-p_i)^{n-2}(1-p_j)^{n-2}\bigr)\\
&= p_ip_jn\,e^{-(p_i+p_j)n} + O(p_ip_j),
\end{aligned}$$

since, by (4.3),

$$e^{-pn} - (1-p)^{n-1} = e^{-pn} - (1-p)^n - p(1-p)^{n-1} = O\bigl(p^2ne^{-pn} + pe^{-pn}\bigr) = O(1/n),$$

and

$$(p_ip_jn)^2(1-p_i)^{n-2}(1-p_j)^{n-2} = O\bigl(p_ip_j\cdot p_ine^{-p_in}\cdot p_jne^{-p_jn}\bigr) = O(p_ip_j).$$

Hence (4.2) yields

$$\begin{aligned}
\operatorname{Var}(Z_n) &= \sum_j\bigl(e^{-p_jn} + O(p_j) - e^{-2p_jn} + O(p_j)\bigr) - \sum_{i\ne j}\bigl(p_ip_jne^{-p_in-p_jn} + O(p_ip_j)\bigr)\\
&= \sum_j\bigl(e^{-p_jn} - e^{-2p_jn}\bigr) - \sum_{i,j}p_ip_jne^{-p_in-p_jn} + \sum_ip_i^2ne^{-2p_in} + O(1)\\
&= \sum_j\bigl(e^{-p_jn} - e^{-2p_jn}\bigr) - n\Bigl(\sum_jp_je^{-p_jn}\Bigr)^2 + O(1),
\end{aligned}$$

which proves (2.9). □
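
The estimates (2.8) and (2.9) can be checked numerically by evaluating the exact formulas (4.1) and (4.2) next to $\mu(n)$ and $\sigma^2(n)$. The Python sketch below does this for an equiprobable example; the function names and the example parameters are assumptions chosen for illustration.

```python
import math

def exact_mean_var(p, n):
    """Exact E(Z_n) and Var(Z_n) from (4.1) and (4.2)."""
    q = [(1.0 - pj) ** n for pj in p]          # q_j = P(urn j is empty after n balls)
    mean = sum(1.0 - qj for qj in q)
    var = sum(qj * (1.0 - qj) for qj in q)
    for i, pi in enumerate(p):
        for j, pj in enumerate(p):
            if i != j:
                var += (1.0 - pi - pj) ** n - q[i] * q[j]
    return mean, var

def approx_mean_var(p, n):
    """mu(n) from (2.2) and sigma^2(n) from (2.5)."""
    mu = sum(1.0 - math.exp(-pj * n) for pj in p)
    v = sum(math.exp(-pj * n) * (1.0 - math.exp(-pj * n)) for pj in p)
    u = sum(pj * n * math.exp(-pj * n) for pj in p)
    return mu, v - u * u / n

p_equal = [1.0 / 50] * 50      # hypothetical example: 50 equiprobable urns
mean, var = exact_mean_var(p_equal, 100)
mu, sigma2 = approx_mean_var(p_equal, 100)
print(mean - mu, var - sigma2)
```

By (4.3), each term of $\mu(n)$ is at most the corresponding term of (4.1), so the first printed difference is nonnegative.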

We proceed to some estimates of $v(x)$, $\tilde v(x)$ and $\sigma^2(x)$, which roughly
indicate why we need to separate into the cases $p_jn \ge 1$ and $p_jn < 1$ in our
manipulations of sums.

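Lemma 4.1. For $x \ge 0$, we have

$$v(x) = \operatorname{Var}(Z(x)) \asymp \sum_{p_jx<1}p_jx + \sum_{p_jx\ge1}e^{-p_jx}, \tag{4.4}$$

$$\tilde v(x) = \operatorname{Var}(\tilde Z(x)) \asymp \sum_{p_jx<1}(p_jx)^2 + \sum_{p_jx\ge1}p_jx. \tag{4.5}$$

Proof. This follows from the definitions (2.3) and (2.6) [see also (3.2)]
and the asymptotics

$$e^{-x}(1 - e^{-x}) \sim \begin{cases} x, & \text{if } x \to 0,\\ e^{-x}, & \text{if } x \to \infty,\end{cases}$$

and

$$x + e^{-x}(1 - e^{-x} - 2x) \sim \begin{cases} x^2/2, & \text{if } x \to 0,\\ x, & \text{if } x \to \infty.\end{cases}$$

□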



Lemma 4.2. For all $x \ge 0$,

$$\sigma^2(x) \le \sum_{p_jx<1}(p_jx)^2 + \sum_{p_jx\ge1}p_jx \tag{4.6}$$

and

$$\sigma^2(x) \ge c_1\Bigl(\sum_{p_jx<1}(p_jx)^2 + \sum_{p_jx\ge1}e^{-p_jx}\Bigr). \tag{4.7}$$

Proof. Since $v(x) \le u(x) \le x$, we have

$$\sigma^2(x) = v(x) - \frac{u(x)^2}{x} \le \frac{u(x)}{x}\bigl(x - u(x)\bigr) \le x - u(x) = \sum_jp_jx\bigl(1 - e^{-p_jx}\bigr),$$

from which the upper bound in (4.6) follows.

On the other hand, by Proposition 2.2, $\sigma^2(x) \ge \sum_je^{-p_jx}\bigl(1 - (1+p_jx)e^{-p_jx}\bigr)$,
which yields (4.7) by the elementary inequality $1 - (1+x)e^{-x} \ge c_2\min\{1, x^2\}$. □

Note that by the inequality $1 - (1+x)e^{-x} \ge x^2e^{-x}/2$, we also have

$$\sigma^2(x) \ge \frac12\sum_j(p_jx)^2e^{-2p_jx}.$$

The following result, based on the estimates we just derived, is crucial for 
the development of our arguments. 

Proposition 4.3. $\sigma^2(x) \asymp \min(v(x), \tilde v(x))$. More precisely:

(i) if $\sum_{p_jx<1}p_j \ge 1/2$, then $\sigma^2(x) \asymp \tilde v(x)$;
(ii) if $\sum_{p_jx\ge1}p_j \ge 1/2$, then $\sigma^2(x) \asymp v(x)$.

Proof. The upper bounds are immediate: $\sigma^2(x) \le v(x)$ by definition,
while $\sigma^2(x) = O(\tilde v(x))$ by (4.5) and (4.6). Alternatively, as pointed out by
one of the referees, the upper bounds also follow from Remark 3.2 and
$\operatorname{Var}(Z) = v(\lambda)$, $\operatorname{Var}(\tilde Z) = \operatorname{Var}(Z - N) = \tilde v(\lambda)$.

For the lower bounds, we treat the two cases separately. 

Case (i): $\sum_{p_jx<1}p_j \ge 1/2$. We have

$$x\sum_{i,j}p_ip_j\bigl(e^{-p_ix} - e^{-p_jx}\bigr)^2 \ge x\sum_{p_ix<1}\sum_{p_jx\ge2}p_ip_j\bigl(e^{-1} - e^{-2}\bigr)^2 \ge \frac{(e^{-1} - e^{-2})^2}{2}\sum_{p_jx\ge2}p_jx.$$

Thus, using Proposition 2.2,

$$\sum_{p_jx\ge2}p_jx = O(\sigma^2(x)).$$

Moreover, by (4.7),

$$\sum_{p_jx<1}(p_jx)^2 + \sum_{1\le p_jx<2}p_jx = O(\sigma^2(x)).$$

Hence, by (4.5), $\tilde v(x) = O(\sigma^2(x))$.

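Case (ii): $\sum_{p_jx\ge1}p_j \ge 1/2$. First, we have by Proposition 2.2

$$\sum_{p_ix<1/2}p_ix \le 2x\sum_{p_ix<1/2}\sum_{p_jx\ge1}p_ip_j = O\Bigl(x\sum_{i,j}p_ip_j\bigl(e^{-p_ix} - e^{-p_jx}\bigr)^2\Bigr) = O(\sigma^2(x)).$$

Furthermore, by (4.7),

$$\sum_{1/2\le p_jx<1}p_jx + \sum_{p_jx\ge1}e^{-p_jx} = O(\sigma^2(x)). \tag{4.8}$$

Thus, by (4.4), $v(x) = O(\sigma^2(x))$. □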

Remark 4.1 (An interesting estimate). It follows from Lemma 4.1 that
in case (i), $v(x) \asymp x$, and in case (ii), $\tilde v(x) \asymp x$. Hence, $\max(v(x), \tilde v(x)) \asymp x$.
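
Proposition 4.3 and Remark 4.1 can be explored numerically with the following Python sketch, which tabulates the ratios $\sigma^2(x)/\min(v(x), \tilde v(x))$ and $\max(v(x), \tilde v(x))/x$ over a range of $x$. The function names and the equiprobable example are assumptions for illustration. Both ratios are at most 1 (the first by the regression interpretation in Remark 3.2, the second because $v(x) \le u(x) \le x$); the substance of the results above is that they also stay bounded away from 0.

```python
import math

def moments(p, x):
    """Return (v, v_tilde, sigma2) from (2.3), (2.6) and (2.5)."""
    e = [math.exp(-pj * x) for pj in p]
    v = sum(ej * (1.0 - ej) for ej in e)
    u = sum(pj * x * ej for pj, ej in zip(p, e))
    return v, x + v - 2.0 * u, v - u * u / x

def comparison_ratios(p, xs):
    """For each x return (x, sigma2 / min(v, v_tilde), max(v, v_tilde) / x)."""
    rows = []
    for x in xs:
        v, v_tilde, sigma2 = moments(p, x)
        rows.append((x, sigma2 / min(v, v_tilde), max(v, v_tilde) / x))
    return rows

p_equal = [1.0 / 100] * 100    # hypothetical example: 100 equiprobable urns
for row in comparison_ratios(p_equal, [1.0, 10.0, 100.0, 1000.0]):
    print(row)
```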

5. Local limit theorem when $\sum_{p_jn\ge1}p_j \ge 1/2$. We prove Theorem 2.4
in this section when $\sum_{p_jn\ge1}p_j \ge 1/2$. Our starting point is the integral
representation

$$\mathbb{P}(Z_n = m) = \frac{n!\,n^{-n}}{(2\pi)^2}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}e^{-im\varphi - in\theta}P(ne^{i\theta}, e^{i\varphi})\,d\theta\,d\varphi, \tag{5.1}$$

which follows from (3.4) by standard coefficient extraction.
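
The integral representation (5.1) can be checked numerically for a small example. The Python sketch below approximates the double integral by the trapezoidal rule on a uniform grid, using the product formula (3.5) for $P$, and compares the result with exact probabilities for equiprobable urns obtained from an elementary recursion. The function names, grid sizes and example are assumptions; the grid in $\varphi$ must have more points than there are urns, and the grid in $\theta$ must be fine enough for the aliasing error in $z$ to be negligible.

```python
import cmath
import math

def egf(z, y, p):
    """P(z, y) = prod_j (1 + y (e^{p_j z} - 1)), cf. (3.5)."""
    value = 1.0 + 0.0j
    for pj in p:
        value *= 1.0 + y * (cmath.exp(pj * z) - 1.0)
    return value

def pmf_from_integral(p, n, m, n_theta=128, n_phi=32):
    """Approximate the right-hand side of (5.1) by the trapezoidal rule."""
    total = 0.0 + 0.0j
    for k in range(n_theta):
        theta = 2.0 * math.pi * k / n_theta
        z = n * cmath.exp(1j * theta)
        for l in range(n_phi):
            phi = 2.0 * math.pi * l / n_phi
            total += cmath.exp(-1j * (m * phi + n * theta)) * egf(z, cmath.exp(1j * phi), p)
    mean_value = total / (n_theta * n_phi)   # approximates the integral divided by (2 pi)^2
    return math.factorial(n) / n ** n * mean_value.real

def pmf_equiprobable(n, M):
    """Exact P(Z_n = m) for M equally likely urns, by a one-step recursion."""
    pmf = [1.0] + [0.0] * M
    for _ in range(n):
        new = [0.0] * (M + 1)
        for m, prob in enumerate(pmf):
            new[m] += prob * m / M
            if m < M:
                new[m + 1] += prob * (M - m) / M
        pmf = new
    return pmf

p_equal = [0.1] * 10           # hypothetical example: 10 equiprobable urns, n = 20 balls
exact = pmf_equiprobable(20, 10)
for m in (7, 8, 9, 10):
    print(m, pmf_from_integral(p_equal, 20, m), exact[m])
```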

Our strategy is to apply the two-dimensional saddle-point method. More 
precisely, we split the integration ranges of the double integral into three 
parts: 

$$\iint_{\substack{|\theta|\le\theta_0\\ |\varphi|\le\varphi_0}} + \iint_{\substack{|\theta|\le\theta_0\\ \varphi_0<|\varphi|\le\pi}} + \iint_{\substack{\theta_0<|\theta|\le\pi\\ |\varphi|\le\pi}}, \tag{5.2}$$

where $\theta_0$ and $\varphi_0$ are usually so chosen that they satisfy the conditions for
the saddle-point method:

$$n\theta_0^2 \to \infty,\quad n\theta_0^3 \to 0 \quad\text{and}\quad \sigma(n)^2\varphi_0^2 \to \infty,\quad \sigma(n)^2\varphi_0^3 \to 0.$$

For technical convenience, we will instead choose $\theta_0 := n^{-1/2}\sigma(n)^{1/3} \le n^{-1/3}$
and $\varphi_0 := \sigma(n)^{-2/3}$, and the usual saddle-point method will require only
minor modifications.

We show that the main contribution to $\mathbb{P}(Z_n = m)$ comes from the first
double integral in (5.2), the other two being asymptotically negligible. As
is often the case, the hard part of the proof is to prove the smallness of
$e^{-n}|P(ne^{i\theta}, e^{i\varphi})|$ when at least one of $\{\theta, \varphi\}$ is away from zero. Note that
$P(n, 1) = e^n$.
5.1. Estimates for $|P(z, e^{i\varphi})|$. We derive in this subsection two major
estimates for $|P(ne^{i\theta}, e^{i\varphi})|$ under the assumption $\sum_{p_jn\ge1}p_j \ge 1/2$. The
corresponding estimates for the case $\sum_{p_jn<1}p_j \ge 1/2$ will be given in the next
section.

Lemma 5.1. Let $z = re^{i\theta}$, $r \ge 0$ and $\theta \in \mathbb{R}$. Then

$$|e^z - 1| \le (e^r - 1)e^{-r(1-\cos\theta)/2}.$$

Proof. We have

$$|e^z - 1| = 2|e^{z/2}||\sinh(z/2)| \le 2e^{(r/2)\cos\theta}\sinh(r/2) = e^{(r/2)\cos\theta}\bigl(e^{r/2} - e^{-r/2}\bigr),$$

from which the result follows. □

Lemma 5.2. If $r \ge 1$ and $|\theta| \le \pi$, then

$$1 + |e^{re^{i\theta}} - 1| \le e^{r - c_3r\theta^2}. \tag{5.3}$$

Proof. By Lemma 5.1,

$$1 + |e^{re^{i\theta}} - 1| \le 1 + (e^r - 1)e^{-r(1-\cos\theta)/2} \le 1 + (e^r - 1)e^{-c_4r\theta^2} \tag{5.4}$$

for $|\theta| \le \pi$, where we can take $c_4 = 1/\pi^2$ by the inequality $1 - \cos\theta \ge 2\theta^2/\pi^2$
for $|\theta| \le \pi$. Define $c_3 := c_4/2$. By the inequalities

$$e^{c_3r\theta^2} + 1 \le e^{r/2} + 1 \le e^r,$$

we have

$$1 - e^{-2c_3r\theta^2} = \bigl(e^{c_3r\theta^2} + 1\bigr)e^{-c_3r\theta^2}\bigl(1 - e^{-c_3r\theta^2}\bigr) \le e^r\bigl(e^{-c_3r\theta^2} - e^{-2c_3r\theta^2}\bigr).$$

The result (5.3) follows from this and (5.4). □
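
The inequality (5.3) can be spot-checked on a grid with the explicit constant $c_3 = 1/(2\pi^2)$ from the proof. The Python sketch below compares logarithms of both sides; the function names, the grid and the chosen values of $r$ are assumptions for illustration, and a grid check is of course not a proof.

```python
import cmath
import math

C3 = 1.0 / (2.0 * math.pi ** 2)    # c_3 = c_4 / 2 with c_4 = 1 / pi^2, as in the proof

def lemma_5_2_gap(r, theta):
    """log(RHS) - log(LHS) of (5.3); a nonnegative value means the inequality holds."""
    lhs = 1.0 + abs(cmath.exp(r * cmath.exp(1j * theta)) - 1.0)
    return (r - C3 * r * theta ** 2) - math.log(lhs)

def min_gap(r_values, n_theta=721):
    """Smallest gap over the given r values and a uniform grid of theta in [-pi, pi]."""
    thetas = [-math.pi + 2.0 * math.pi * k / (n_theta - 1) for k in range(n_theta)]
    return min(lemma_5_2_gap(r, t) for r in r_values for t in thetas)

print(min_gap([1.0, 1.5, 2.0, 5.0, 10.0, 50.0]))
```

At $\theta = 0$ both sides of (5.3) equal $e^r$, so the smallest gap is zero up to rounding error.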

Proposition 5.3. Assume that $\sum_{p_jn\ge1}p_j \ge 1/2$. Then the inequality

$$|P(ne^{i\theta}, e^{i\varphi})| \le e^{n - c_5n\theta^2}$$

holds uniformly for $|\theta| \le \pi$ and $-\infty < \varphi < \infty$, where $c_5 = c_3/2$.

Proof. By (3.5), Lemma 5.2 and the simple estimate $1 + |e^{re^{i\theta}} - 1| \le e^r$
(e.g., by Lemma 5.1),

$$|P(ne^{i\theta}, e^{i\varphi})| \le \prod_j\bigl(1 + |e^{p_jne^{i\theta}} - 1|\bigr) \le \Bigl(\prod_{p_jn<1}e^{p_jn}\Bigr)\Bigl(\prod_{p_jn\ge1}e^{p_jn - c_3p_jn\theta^2}\Bigr) = \exp\Bigl(n - c_3\theta^2\sum_{p_jn\ge1}p_jn\Bigr).$$

□

Since a more detailed estimate for $|P(ne^{i\theta}, e^{i\varphi})|$ and a local expansion of
$P(ne^{i\theta}, e^{i\varphi})$ (for small $\theta$ and $\varphi$) involve several sums related to $u(x)$, $v(x)$,
$\tilde v(x)$ and $\sigma^2(x)$, we now derive a few simple estimates for and relationships
between them.

Lemma 5.4. For $x \ge 0$,

$$u(x) = O\bigl(x \wedge (v(x) + 1)(1 + \log_+x)\bigr).$$

Proof. The upper bound $x$ follows easily from the inequality $e^{-x} \le 1$. For
the other upper bound, we may assume $x \ge 2$. Then, by (4.4),

$$\sum_{p_jx\le\log x}p_jxe^{-p_jx} \le \sum_{p_jx<1}p_jx + \log x\sum_{p_jx\ge1}e^{-p_jx} = O(v(x)\log x), \tag{5.5}$$

while trivially $\sum_{p_jx>\log x}p_jxe^{-p_jx} \le \sum_jp_j = 1$. □

Lemma 5.5. Let $x \ge 1$ and $0 \le \delta \le 1/2$. Then:

(i) $v(x) = O(v((1-\delta)x))$ and $v((1-\delta)x) = O(x^{2\delta}v(x) + 1)$;
(ii) $\tilde v(x) \asymp \tilde v((1-\delta)x)$;
(iii) $\sigma^2(x) = O(\sigma^2((1-\delta)x))$;
(iv) if, furthermore, $\delta \le x^{-1/3}$, then $v((1-\delta)x) = O(v(x) + 1)$ and
$\sigma^2((1-\delta)x) = O(\sigma^2(x) + 1)$.

Proof. (i) We use (4.4) for both $x$ and $(1-\delta)x$; note that we can
split the sum according to $p_jx < 1$ and $p_jx \ge 1$ for $(1-\delta)x$, too. The first
estimate then is obvious. For the second we find, assuming as we may $x \ge 2$,

$$\sum_{p_jx<1}p_j(1-\delta)x \le \sum_{p_jx<1}p_jx = O(v(x)),$$

$$\sum_{1\le p_jx\le2\log x}e^{-p_j(1-\delta)x} \le e^{2\delta\log x}\sum_{1\le p_jx\le2\log x}e^{-p_jx} \le C_2x^{2\delta}v(x),$$

$$\sum_{p_jx>2\log x}e^{-p_j(1-\delta)x} \le \sum_{p_jx>2\log x}e^{-\log x} \le 1,$$

since there are at most $x/\log x$ terms in the last sum.
(ii) Immediate from (4.5).
(iii) and (iv) follow from (i) and (ii) together with Proposition 4.3. □

We now refine Proposition 5.3 and obtain a decrease of $|P(ne^{i\theta}, e^{i\varphi})|$ in
both $\theta$ and $\varphi$. (We are grateful to one of the referees for improving our
previous version.)
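Proposition 5.6. Assume that $\sum_{p_jn\ge1}p_j \ge 1/2$. Then uniformly for
$|\theta| \le \pi$ and $|\varphi| \le \pi$, provided $v(n) \ge 1$,

$$|P(ne^{i\theta}, e^{i\varphi})| \le \exp\bigl(n - c_6n\theta^2 - c_7\sigma^2(n)\varphi^2\bigr). \tag{5.6}$$

Proof. Let $z := ne^{i\theta} = \xi + i\eta$, where $\xi := n\cos\theta$ and $\eta := n\sin\theta$.
Assume first that $|\theta| \le \pi/4$; then $|\eta| \le \xi$ and $\xi \ge n/2$. Thus $n = O(\xi)$ and, by
Lemma 5.5(i), $1 \le v(n) = O(v(\xi))$. By explicit calculation

$$\begin{aligned}
|1 + e^{i\varphi}(e^{\xi+i\eta} - 1)|^2
&= e^{2\xi} - 2e^{\xi}\bigl(\cos\eta - \cos(\varphi+\eta)\bigr) + 2(1 - \cos\varphi)\\
&= e^{2\xi}\bigl(1 - 2(1-\cos\varphi)e^{-\xi}(\cos\eta - e^{-\xi}) - 2e^{-\xi}\sin\varphi\sin\eta\bigr)\\
&\le \exp\bigl(2\xi - 2e^{-\xi}(1-\cos\varphi)(\cos\eta - e^{-\xi}) + 2e^{-\xi}|\sin\varphi||\sin\eta|\bigr),
\end{aligned}$$

which implies that

$$|1 + e^{i\varphi}(e^z - 1)| \le \exp\bigl(\xi - (1-\cos\varphi)e^{-\xi}(\cos\eta - e^{-\xi}) + e^{-\xi}|\sin\varphi||\eta|\bigr).$$

This inequality (applied to $p_jz$) gives, by (3.5),

$$|P(ne^{i\theta}, e^{i\varphi})| \le \exp\bigl(\xi - (1-\cos\varphi)S_1 + |\sin\varphi|S_2\bigr), \tag{5.7}$$

where

$$S_1 := \sum_je^{-p_j\xi}\bigl(\cos(p_j\eta) - e^{-p_j\xi}\bigr), \qquad S_2 := \sum_jp_j|\eta|e^{-p_j\xi}.$$

By (2.5), $u(\xi)^2 \le \xi v(\xi)$, and thus

$$|\sin\varphi|S_2 = O\bigl(|\theta\varphi|u(\xi)\bigr) = O\bigl(\sqrt{n\theta^2v(\xi)\varphi^2}\bigr). \tag{5.8}$$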

For $S_1$, let $c_8 > 0$ be chosen such that $\cos(x) - e^{-x} > 0$ on $(0, 2c_8]$ (e.g.,
$c_8 = 1/2$), and decompose the sum into three parts:

$$S_1 = \Bigl(\sum_{p_j\xi\le c_8} + \sum_{c_8<p_j\xi\le c_8/|\theta|} + \sum_{p_j\xi>c_8/|\theta|}\Bigr)e^{-p_j\xi}\bigl(\cos(p_j\eta) - e^{-p_j\xi}\bigr) =: T_1 + T_2 + T_3.$$

Consider first $T_1$. For each term in $T_1$ we have $p_j|\eta| \le p_j\xi \le c_8$, and since
the function $x \mapsto (\cos x - e^{-x})/x$ extends to a continuous strictly positive
function on $[0, c_8]$,

$$\cos(p_j\eta) - e^{-p_j\xi} \ge \cos(p_j\xi) - e^{-p_j\xi} \ge c_9p_j\xi.$$

Furthermore, $e^{-p_j\xi} \ge e^{-c_8}$, and consequently

$$T_1 \ge c_{10}\sum_{p_j\xi\le c_8}p_j\xi.$$

For a term in $T_2$ we have either $p_j|\eta| \le c_8$ and then

$$\cos(p_j\eta) - e^{-p_j\xi} \ge \cos(c_8) - e^{-c_8} > 0,$$

or $c_8 < p_j|\eta| = p_j\xi|\tan\theta| \le 2p_j\xi|\theta| \le 2c_8$ and then

$$\cos(p_j\eta) - e^{-p_j\xi} \ge \cos(p_j\eta) - e^{-p_j|\eta|} \ge c_{11}.$$

Consequently,

$$T_2 \ge c_{12}\sum_{c_8<p_j\xi\le c_8/|\theta|}e^{-p_j\xi}.$$

For $T_3$, which has at most $\xi|\theta|/c_8$ terms, we subtract $e^{-p_j\xi}$ from each term
and use the trivial estimate $|\cos(p_j\eta) - e^{-p_j\xi} - 1| \le 3$, finding

$$T_3 - \sum_{p_j\xi|\theta|>c_8}e^{-p_j\xi} = O\bigl(\xi|\theta|e^{-c_8/|\theta|}\bigr) = O(n\theta^4).$$

Combining these estimates, we obtain, using Lemma 4.1,

$$S_1 \ge c_{13}\Bigl(\sum_{p_j\xi\le c_8}p_j\xi + \sum_{p_j\xi>c_8}e^{-p_j\xi}\Bigr) + O(n\theta^4) \ge c_{14}v(\xi) + O(n\theta^4). \tag{5.9}$$

The estimates (5.7), (5.8), (5.9), the Taylor expansion $\xi = n\cos\theta = n -
n\theta^2/2 + O(n\theta^4)$, and the inequality $1 - \cos x \ge 2x^2/\pi^2$ for $x \in [-\pi, \pi]$ yield

$$|P(ne^{i\theta}, e^{i\varphi})| \le \exp\bigl(n - n\theta^2/2 - c_{15}\varphi^2v(\xi) + O(n\theta^4) + O(\sqrt{n\theta^2v(\xi)\varphi^2})\bigr).$$

The required result (5.6) now follows, using $\sigma^2(n) \le v(n) = O(v(\xi))$, provided
$|\theta| \le c_{16}$ and $n\theta^2 \le c_{17}v(\xi)\varphi^2$. In both the remaining cases, the result follows
from Proposition 5.3 if $c_6$ and $c_7$ are small enough. □

5.2. Local expansion for $P(ne^{i\theta}, e^{i\varphi})$. We first rewrite (3.5) as

$$P(z, e^{i\varphi}) = e^z\prod_jG(p_jz, i\varphi), \tag{5.10}$$

where

$$G(z, \zeta) := 1 + (1 - e^{-z})(e^{\zeta} - 1) = e^{\zeta} - e^{\zeta-z} + e^{-z}, \tag{5.11}$$

with $z, \zeta \in \mathbb{C}$. We begin with an expansion of $G$.
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Lemma 5.7. If \ argz\ < tt/3 and |£| < c±s, then 
G(z, C) = exp((l - e~*)C + \{e~ z - e~ 2z )C 2 + 0(|C|V & (1 - e"**))). 

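Proof. Choose $c_{18}$ such that $|e^{\zeta} - 1| \le 1/4$ when $|\zeta| \le c_{18}$. Let

\[ D := \{(z, \zeta) : |\arg z| \le \pi/3,\ |\zeta| \le c_{18}\}. \]

Then, for $(z, \zeta) \in D$,

\[ |G(z, \zeta) - 1| = |1 - e^{-z}|\,|e^{\zeta} - 1| \le 2 \cdot \tfrac14 = \tfrac12. \]

Hence, $g(z, \zeta) := \log G(z, \zeta)$ is well defined on $D$; moreover, $|G(z, \zeta)| \ge 1/2$
on $D$. Straightforward calculus yields, on $D$,

\[ \frac{\partial}{\partial\zeta} g(z, \zeta) = \frac{(1 - e^{-z})e^{\zeta}}{G(z, \zeta)} = 1 - \frac{e^{-z}}{G(z, \zeta)}, \tag{5.12} \]

\[ \frac{\partial^2}{\partial\zeta^2} g(z, \zeta) = \frac{e^{-z}\,\frac{\partial}{\partial\zeta} G(z, \zeta)}{G(z, \zeta)^2} = \frac{e^{-z}(1 - e^{-z})e^{\zeta}}{G(z, \zeta)^2}, \tag{5.13} \]

\[ \frac{\partial^3}{\partial\zeta^3} g(z, \zeta) = O\bigl(|e^{-z}(1 - e^{-z})|\bigr). \tag{5.14} \]

Since $|\arg z| \le \pi/3$, we see that if $|z| \le 1$, then $|1 - e^{-z}| = O(|z|) = O(\Re z) =
O(1 - e^{-\Re z})$, and if $|z| \ge 1$, then $|1 - e^{-z}| \le 2 = O(1 - e^{-\Re z})$. Hence, in either
case, $1 - e^{-z} = O(1 - e^{-\Re z})$, and by (5.14), we get
$\frac{\partial^3}{\partial\zeta^3} g(z, \zeta) = O(e^{-\Re z}(1 - e^{-\Re z}))$ on $D$. Moreover, $G(z, 0) = 1$ so
$g(z, 0) = 0$, and, by (5.12) and (5.13),
$\frac{\partial}{\partial\zeta} g(z, 0) = 1 - e^{-z}$ and $\frac{\partial^2}{\partial\zeta^2} g(z, 0) = e^{-z}(1 - e^{-z})$.
This proves the lemma. □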
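The expansion in Lemma 5.7 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the test point $z = 2e^{0.5i}$ are illustrative assumptions. It compares $\log G(z, \zeta)$ with the linear and quadratic terms and confirms that the remainder scales like $|\zeta|^3$.

```python
import cmath

def G(z, zeta):
    """G(z, zeta) = 1 + (1 - e^{-z})(e^{zeta} - 1), as in (5.11)."""
    return 1 + (1 - cmath.exp(-z)) * (cmath.exp(zeta) - 1)

def main_terms(z, zeta):
    """Linear and quadratic terms of log G from Lemma 5.7."""
    e1 = cmath.exp(-z)
    return (1 - e1) * zeta + 0.5 * (e1 - e1 * e1) * zeta ** 2

def remainder(z, zeta):
    """Size of log G minus its quadratic approximation."""
    return abs(cmath.log(G(z, zeta)) - main_terms(z, zeta))

# Illustrative point inside the sector |arg z| <= pi/3 (assumption).
z = 2.0 * cmath.exp(0.5j)
for t in (0.04, 0.02, 0.01):
    # remainder / t^3 should stay roughly constant as t shrinks
    print(t, remainder(z, 1j * t) / t ** 3)
```

Halving $|\zeta|$ should divide the remainder by about eight, which is what a cubic error term predicts.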

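Proposition 5.8. If $|\theta| \le \pi/3$ and $|\varphi| \le c_{18}$, then

\[ P(ne^{i\theta}, e^{i\varphi}) = \exp\Bigl(ne^{i\theta} + \mu(n) i\varphi - u(n)\varphi\theta - \tfrac12 v(n)\varphi^2 + O\bigl(v(n\cos\theta)|\varphi|^3 + n\theta^2|\varphi|\bigr)\Bigr). \]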

Proof. Let $z := ne^{i\theta}$ and $\xi := \Re z = n\cos\theta$. By (5.10) and Lemma 5.7,

\[ P(z, e^{i\varphi}) = \exp\bigl(z + \mu(z) i\varphi - \tfrac12 v(z)\varphi^2 + O(v(\xi)|\varphi|^3)\bigr). \]

Observe that $\mu'(z) = \sum_j p_j e^{-p_j z} = u(z)/z$ and $\mu''(z) = -\sum_j p_j^2 e^{-p_j z}$. Thus,
by a Taylor expansion and Lemmas 5.4 and 5.5,

\[ \mu(z) = \mu(n) + u(n) i\theta + O(n\theta^2). \tag{5.15} \]

Similarly, using the inequality $u^2(x)/x \le v(x)$,

\[ |v'(z)| = \Bigl|\sum_j \bigl(2p_j e^{-2p_j z} - p_j e^{-p_j z}\bigr)\Bigr| \le 3u(\xi)/\xi \le 3\bigl(v(\xi)/\xi\bigr)^{1/2}, \]

and thus

\[ v(z) = v(n) + O\bigl(|\theta|(n v(\xi))^{1/2}\bigr). \]

The desired result follows from these estimates and the inequality
$(n v(\xi))^{1/2}|\theta|\varphi^2 \le v(\xi)|\varphi|^3 + n\theta^2|\varphi|$. □
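As a sanity check on the Taylor step (5.15), the sketch below evaluates $\mu(ne^{i\theta}) - \mu(n) - u(n)i\theta$ for a concrete urn model and divides by $n\theta^2$. The Zipf-type probabilities, the sizes and the function names are assumptions chosen only for illustration.

```python
import cmath

def mu(p, x):
    """mu(x) = sum_j (1 - exp(-p_j x)), allowing complex x."""
    return sum(1 - cmath.exp(-pj * x) for pj in p)

def u(p, x):
    """u(x) = x * sum_j p_j exp(-p_j x), allowing complex x."""
    return x * sum(pj * cmath.exp(-pj * x) for pj in p)

def taylor_error_ratio(p, n, theta):
    """|mu(z) - mu(n) - u(n) i theta| / (n theta^2) with z = n e^{i theta}."""
    z = n * cmath.exp(1j * theta)
    err = mu(p, z) - mu(p, n) - u(p, n) * 1j * theta
    return abs(err) / (n * theta ** 2)

def zipf_probabilities(M):
    """Illustrative distribution p_j proportional to 1/j, j = 1..M."""
    w = [1.0 / j for j in range(1, M + 1)]
    s = sum(w)
    return [x / s for x in w]

p = zipf_probabilities(200)
for theta in (0.2, 0.1, 0.05):
    print(theta, taylor_error_ratio(p, 500, theta))
```

The printed ratios staying bounded as $\theta$ decreases is consistent with the $O(n\theta^2)$ remainder in (5.15).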

5.3. Proof of Theorem 2.4 when $\sum_{p_j n \ge 1} p_j \ge 1/2$. The remaining analysis
is straightforward. We assume that $\sigma^2(n) \ge 1$, since otherwise the result
is trivial. Recall that $\theta_0 := n^{-1/2}\sigma(n)^{1/3} \le n^{-1/3}$ and $\varphi_0 := \sigma(n)^{-2/3}$. We
assume that $\sum_{p_j n \ge 1} p_j \ge 1/2$, and thus $\sigma^2(n) \asymp v(n)$ by Proposition 4.3.

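We start from (5.1) and split the integral into the three parts in (5.2):

\[ P(Z_n = m) = \frac{n!\,n^{-n}}{(2\pi)^2}\Bigl(\iint_{|\theta| \le \theta_0,\ |\varphi| \le \varphi_0} + \iint_{|\theta| \le \theta_0,\ \varphi_0 < |\varphi| \le \pi} + \iint_{\theta_0 < |\theta| \le \pi,\ |\varphi| \le \pi}\Bigr) e^{-in\theta - im\varphi} P(ne^{i\theta}, e^{i\varphi})\,d\theta\,d\varphi =: J_1 + J_2 + J_3. \]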
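Observe first that, by Stirling's formula,

\[ \frac{n!\,n^{-n}}{(2\pi)^2} = (2\pi)^{-3/2}\sqrt{n}\,e^{-n}\bigl(1 + O(1/n)\bigr) \tag{5.16} \]

and thus

\[ \frac{n!\,n^{-n}}{(2\pi)^2} = O\bigl(\sqrt{n}\,e^{-n}\bigr). \tag{5.17} \]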

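Obviously, by Proposition 5.3 and (5.17),

\[ J_3 = O\Bigl(\sqrt{n}\int_{\theta_0}^{\infty} e^{-c_5 n\theta^2}\,d\theta\Bigr) = O\bigl(n^{-1/2}\theta_0^{-1} e^{-c_5 n\theta_0^2}\bigr) = O\bigl(e^{-c_5\sigma(n)^{2/3}}\bigr) = O(\sigma^{-2}(n)). \tag{5.18} \]

On the other hand, Proposition 5.6 gives, for $n \ge C_3$ and $|\theta| \le \theta_0$,

\[ J_2 = O\Bigl(\sqrt{n}\int_{-\infty}^{\infty} e^{-c_6 n\theta^2}\,d\theta \int_{\varphi_0}^{\infty} e^{-c_7\sigma^2(n)\varphi^2}\,d\varphi\Bigr) = O\bigl(e^{-c_7\sigma^2(n)\varphi_0^2}\bigr) = O\bigl(e^{-c_7\sigma(n)^{2/3}}\bigr) = O(\sigma^{-2}(n)). \tag{5.19} \]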

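We turn to $J_1$, the main term. If $n \ge C_4$ and $\sigma^2(n) \ge C_5$, then Proposition 5.8
applies when $|\theta| \le \theta_0$ and $|\varphi| \le \varphi_0$, and shows, together with
Proposition 4.3 and Lemma 5.5(iv), that

\[ P(ne^{i\theta}, e^{i\varphi}) = \exp\bigl(n + in\theta - \tfrac12 n\theta^2 + \mu(n) i\varphi - u(n)\varphi\theta - \tfrac12 v(n)\varphi^2 + R(\theta, \varphi)\bigr), \]

where

\[ R(\theta, \varphi) = O\bigl(\sigma^2(n)|\varphi|^3 + n\theta^2|\varphi| + n|\theta|^3\bigr) = O(1). \]

Let

\[ K(\theta, \varphi) := \exp\bigl(i(\mu(n) - m)\varphi - \tfrac12 n\theta^2 - u(n)\theta\varphi - \tfrac12 v(n)\varphi^2\bigr). \tag{5.20} \]

Then, by the inequality $|e^z - 1| \le |z|e^{|z|}$,

\[ J_1 = \frac{n!\,n^{-n}e^{n}}{(2\pi)^2}\int_{-\varphi_0}^{\varphi_0}\int_{-\theta_0}^{\theta_0} K(\theta, \varphi)e^{R(\theta, \varphi)}\,d\theta\,d\varphi = \frac{n!\,n^{-n}e^{n}}{(2\pi)^2}\int_{-\varphi_0}^{\varphi_0}\int_{-\theta_0}^{\theta_0} K(\theta, \varphi)\bigl(1 + O(|R(\theta, \varphi)|)\bigr)\,d\theta\,d\varphi. \]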
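We first estimate the error term. Observe that

\[ |K(\theta, \varphi)| = \exp\bigl(-\tfrac12 n\theta^2 - u(n)\theta\varphi - \tfrac12 v(n)\varphi^2\bigr) = \exp\bigl(-\tfrac12 A(\sqrt{n}\,\theta, \sqrt{v(n)}\,\varphi)\bigr), \]

where $A$ is the quadratic form

\[ A(x, y) := x^2 + y^2 + 2\,\frac{u(n)}{\sqrt{n v(n)}}\,xy. \]

Since

\[ \Bigl(\frac{u(n)}{\sqrt{n v(n)}}\Bigr)^2 = \frac{u(n)^2/n}{v(n)} = \frac{v(n) - \sigma^2(n)}{v(n)} \le 1 - c_{19} \]

by Proposition 4.3(ii), we see that $A(x, y) \ge c_{20}(x^2 + y^2)$, implying that

\[ |K(\theta, \varphi)| \le e^{-c_{21} n\theta^2 - c_{21}\sigma^2(n)\varphi^2}. \tag{5.21} \]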

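It follows that

\[ \frac{n!\,n^{-n}e^{n}}{(2\pi)^2}\int_{-\varphi_0}^{\varphi_0}\int_{-\theta_0}^{\theta_0} |K(\theta, \varphi)|\,|R(\theta, \varphi)|\,d\theta\,d\varphi \]
\[ \qquad = O\Bigl(\sqrt{n}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \bigl(\sigma^2(n)|\varphi|^3 + n\theta^2|\varphi| + n|\theta|^3\bigr) e^{-c_{21} n\theta^2 - c_{21}\sigma^2(n)\varphi^2}\,d\theta\,d\varphi\Bigr) \]
\[ \qquad = O\bigl(\sigma(n)^{-2} + n^{-1/2}\sigma(n)^{-1}\bigr) = O(\sigma(n)^{-2}). \]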

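It remains only to evaluate the integral of $K$ over the remaining region.
The estimate (5.21) implies, arguing as for $J_3$ and $J_2$ in (5.18) and (5.19),
that

\[ \sqrt{n}\Bigl(\iint_{|\theta| \le \theta_0,\ |\varphi| > \varphi_0} + \iint_{|\theta| > \theta_0}\Bigr) |K(\theta, \varphi)|\,d\theta\,d\varphi = O(\sigma(n)^{-2}). \]

Collecting the estimates above, we get

\[ P(Z_n = m) = \frac{n!\,n^{-n}e^{n}}{(2\pi)^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} K(\theta, \varphi)\,d\theta\,d\varphi + O(\sigma(n)^{-2}). \tag{5.22} \]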

Since (5.20) can be rewritten as

\[ K(\theta, \varphi) = \exp\bigl(-i(m - \mu(n))\varphi - \tfrac12 n(\theta + (u(n)/n)\varphi)^2 - \tfrac12\sigma^2(n)\varphi^2\bigr), \]

it then follows that

\[ \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} K(\theta, \varphi)\,d\theta\,d\varphi = n^{-1/2}\sigma(n)^{-1}\exp\Bigl(-\frac{(m - \mu(n))^2}{2\sigma^2(n)}\Bigr), \]

which, together with (5.22) and (5.16), completes the proof; note that if
$m = \lfloor \mu(n) + x\sigma(n) \rfloor$, then $(m - \mu(n))/\sigma(n) = x + O(1/\sigma(n))$, and that
$x \mapsto e^{-x^2/2}$ has bounded derivatives. The assumptions above that $n$ and $\sigma^2(n)$
be large are harmless since $n \ge \sigma^2(n)$ and the result is trivial for $\sigma^2(n) \le C_j$,
for any fixed $C_j$.
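The local limit statement just proved can be illustrated on the classical equiprobable model, where the exact law of $Z_n$ is available by a simple recursion. The sketch below is only an illustration under assumed parameters ($M = 100$ urns, $n = 200$ balls) and with hypothetical function names; it compares the exact probabilities with the Gaussian density built from $\mu(n)$ and $\sigma^2(n) = v(n) - u(n)^2/n$.

```python
import math

def occupancy_pmf(n, M):
    """Exact pmf of Z_n (occupied urns) for n balls in M equiprobable urns."""
    pmf = [0.0] * (M + 1)
    pmf[0] = 1.0
    for _ in range(n):
        new = [0.0] * (M + 1)
        for j, pr in enumerate(pmf):
            if pr == 0.0:
                continue
            new[j] += pr * j / M                 # ball hits an occupied urn
            if j < M:
                new[j + 1] += pr * (M - j) / M   # ball opens a new urn
        pmf = new
    return pmf

def normal_parameters(n, M):
    """mu(n) and sigma^2(n) = v(n) - u(n)^2/n for p_j = 1/M."""
    q = math.exp(-n / M)
    mu = M * (1 - q)
    v = M * q * (1 - q)
    u = n * q
    return mu, v - u * u / n

def max_local_error(n, M):
    """sup over m of |P(Z_n = m) - Gaussian density at m|, plus sigma^2(n)."""
    pmf = occupancy_pmf(n, M)
    mu, s2 = normal_parameters(n, M)
    def dens(m):
        return math.exp(-(m - mu) ** 2 / (2 * s2)) / math.sqrt(2 * math.pi * s2)
    return max(abs(pmf[m] - dens(m)) for m in range(M + 1)), s2

err, s2 = max_local_error(200, 100)
print(err, 1 / s2)
```

In this example the maximal pointwise error should come out well below $1/\sigma^2(n)$, in line with the $O(\sigma(n)^{-2})$ error term.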

Remark 5.1 (Central limit theorem). If one is interested in proving only
the central limit theorem, then Propositions 5.3 and 5.8 suffice. If, moreover,
a Berry-Esseen bound is desired, then Proposition 5.6 is needed for $|\varphi| \le \varepsilon$
for some $\varepsilon > 0$.

6. Proof of Theorem 2.4 when $\sum_{p_j n < 1} p_j \ge 1/2$. We consider in this
section the case when $\sum_{p_j n < 1} p_j \ge 1/2$. Our underlying idea is then to study
$n - Z_n$ instead of $Z_n$, and the corresponding Poissonization $\tilde Z(n)$; see
Remark 3.2 and recall that $\operatorname{Var}(\tilde Z(n)) = \tilde v(n) \asymp \sigma^2(n)$. We find
$P(Z_n = m) = P(n - Z_n = n - m)$ by extracting coefficients in $P(ze^{i\varphi}, e^{-i\varphi})$; see (3.6).
This yields the integral formula

\[ P(n - Z_n = n - m) = \frac{n!\,n^{-n}}{(2\pi)^2}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi} e^{-in\theta - i(n - m)\varphi} P(ne^{i\theta + i\varphi}, e^{-i\varphi})\,d\theta\,d\varphi. \tag{6.1} \]

Note that this formula also follows directly from (5.1) by a simple change of
variables; there is thus no formal need of $\tilde Z$ and the motivation above.

The analysis of this double integral is very similar to the one in Section 5;
the main difference is that the occurrences of $v(n)$ in our estimates have to
be replaced by $\tilde v(n)$.

6.1. Estimates for $|P(ne^{i\theta + i\varphi}, e^{-i\varphi})|$. We begin with a companion to
Lemma 5.2.

Lemma 6.1. If $0 \le r \le 1$ and $|\theta| \le \pi$, then

\[ 1 + |e^{re^{i\theta}} - 1| \le e^{r - c_{22} r^2\theta^2}. \]

Proof. By Lemma 5.1, we have (5.4) and thus

\[ e^{-r}\bigl(1 + |e^{re^{i\theta}} - 1|\bigr) \le e^{-r} + (1 - e^{-r})e^{-c_4 r\theta^2} = 1 - (1 - e^{-r})(1 - e^{-c_4 r\theta^2}) \le \exp\bigl(-(1 - e^{-r})(1 - e^{-c_4 r\theta^2})\bigr) \]

and the result follows. □

Lemma 6.2. Uniformly for $|\theta| \le \pi$ and $-\infty < \varphi < \infty$,

\[ |P(ne^{i\theta}, e^{i\varphi})| \le e^{n - c_{23}\tilde v(n)\theta^2}. \]

Proof. A simple consequence of (3.5), Lemmas 5.2 and 6.1 and (4.5). □



Lemma 6.3. (i) If $r \ge 0$, $|\theta| \le \pi$ and $|\zeta| = 1$, then

\[ |\zeta + e^{\zeta re^{i\theta}} - 1| \le e^{r}. \]

(ii) If, furthermore, $0 \le r \le 1$, then

\[ |\zeta + e^{\zeta re^{i\theta}} - 1| \le e^{r - c_{24} r\theta^2}. \]

Proof. Expanding the function $\zeta + e^{\zeta re^{i\theta}} - 1$ at $r = 0$ gives

\[ |\zeta + e^{\zeta re^{i\theta}} - 1| \le |\zeta + \zeta re^{i\theta}| + \sum_{k \ge 2}\frac{|\zeta re^{i\theta}|^k}{k!} = |1 + re^{i\theta}| + \sum_{k \ge 2}\frac{r^k}{k!} = |1 + re^{i\theta}| + e^{r} - r - 1. \tag{6.2} \]

Part (i) follows immediately. On the other hand, since

\[ |1 + re^{i\theta}|^2 = (1 + r)^2 - 2r(1 - \cos\theta) \le (1 + r)^2 - 4c_4 r\theta^2, \]

we have the inequality

\[ |1 + re^{i\theta}| \le 1 + r - c_{25} r\theta^2 \qquad (r \in [0, 1]). \]

This together with (6.2) yields

\[ |\zeta + e^{\zeta re^{i\theta}} - 1| \le e^{r} - c_{25} r\theta^2 \le e^{r}(1 - c_{24} r\theta^2) \le e^{r - c_{24} r\theta^2} \]

uniformly for $r \in [0, 1]$. □
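Part (i) of Lemma 6.3 is easy to probe on a grid. The following sketch is an illustration only (the grid size, the values of $r$ and the function names are assumptions); it evaluates $|\zeta + e^{\zeta re^{i\theta}} - 1|/e^{r}$ for $\zeta = e^{i\varphi}$ over $|\theta|, |\varphi| \le \pi$ and reports the largest ratio found.

```python
import cmath
import math

def lhs(r, theta, phi):
    """|zeta + exp(zeta * r * e^{i theta}) - 1| with zeta = e^{i phi}."""
    zeta = cmath.exp(1j * phi)
    return abs(zeta + cmath.exp(zeta * r * cmath.exp(1j * theta)) - 1)

def worst_ratio(r_values, steps=60):
    """Largest value of lhs / e^r over a uniform grid in theta and phi."""
    worst = 0.0
    for r in r_values:
        for a in range(steps + 1):
            theta = -math.pi + 2 * math.pi * a / steps
            for b in range(steps + 1):
                phi = -math.pi + 2 * math.pi * b / steps
                worst = max(worst, lhs(r, theta, phi) / math.exp(r))
    return worst

print(worst_ratio([0.1, 0.5, 1.0, 3.0]))
```

The ratio should never exceed 1 up to rounding, with equality approached at $\theta = \varphi = 0$, where the left-hand side equals $e^{r}$ exactly.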

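The next proposition is the analogue of Proposition 5.3 when $\sum_{p_j n < 1} p_j \ge 1/2$.

Proposition 6.4. Assume that $\sum_{p_j n < 1} p_j \ge 1/2$. Then uniformly for
$|\theta| \le \pi$ and $-\infty < \varphi < \infty$,

\[ |P(ne^{i(\theta + \varphi)}, e^{-i\varphi})| \le e^{n - c_{26} n\theta^2}. \]

Proof. By (3.5) and Lemma 6.3 with $\zeta = e^{i\varphi}$,

\[ |P(ne^{i(\theta + \varphi)}, e^{-i\varphi})| = \prod_j \bigl|e^{i\varphi} + e^{p_j n e^{i(\theta + \varphi)}} - 1\bigr| \le \Bigl(\prod_{p_j n < 1} e^{p_j n - c_{24} p_j n\theta^2}\Bigr)\Bigl(\prod_{p_j n \ge 1} e^{p_j n}\Bigr) = \exp\Bigl(n - c_{24}\theta^2\sum_{p_j n < 1} p_j n\Bigr). \]

□

The corresponding analogue of Proposition 5.6 is the following.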



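Proposition 6.5. Assume that $\sum_{p_j n < 1} p_j \ge 1/2$. Then uniformly for
$|\theta| \le \pi$ and $|\varphi| \le \pi$,

\[ |P(ne^{i(\theta + \varphi)}, e^{-i\varphi})| \le \exp\bigl(n - c_{27}(n\theta^2 + \sigma^2(n)\varphi^2)\bigr). \]

Proof. If $|\varphi| \le 2|\theta|$, then $n\theta^2 + \sigma^2(n)\varphi^2 = O(n\theta^2)$, and the result
follows by Proposition 6.4.

On the other hand, if $|\varphi| > 2|\theta|$, then $|\theta + \varphi| \le \tfrac32|\varphi| \le \tfrac32\pi$. Note that
Lemma 6.2 extends to $|\theta| \le \tfrac32\pi$ (with a new $c_{23}$) since if $\pi < |\theta| \le \tfrac32\pi$, we
may replace $\theta$ by $\theta \pm 2\pi$. Hence, by Lemma 6.2 and Proposition 6.4,

\[ e^{-n}|P(ne^{i(\theta + \varphi)}, e^{-i\varphi})| \le \exp\bigl(-\tfrac12(c_{23}\tilde v(n)(\theta + \varphi)^2 + c_{26} n\theta^2)\bigr), \]

and the result follows because $\sigma^2(n) = O(\tilde v(n))$, $\varphi^2 \le 2\theta^2 + 2(\theta + \varphi)^2$, and
thus $n\theta^2 + \sigma^2(n)\varphi^2 = O(n\theta^2 + \tilde v(n)(\theta + \varphi)^2)$. □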

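6.2. Local expansion for $P(ne^{i\theta + i\varphi}, e^{-i\varphi})$. We turn now to a local
expansion of $P(ne^{i\theta + i\varphi}, e^{-i\varphi})$. We will use

\[ P(ze^{i\varphi}, e^{-i\varphi}) = \prod_j \bigl(1 + e^{p_j ze^{i\varphi} - i\varphi} - e^{-i\varphi}\bigr) = e^{z}\prod_j H(p_j z, i\varphi), \tag{6.3} \]

where we define

\[ H(z, \zeta) := e^{-z}\bigl(1 + e^{ze^{\zeta} - \zeta} - e^{-\zeta}\bigr) = e^{z(e^{\zeta} - 1) - \zeta} + e^{-z}(1 - e^{-\zeta}). \tag{6.4} \]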
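Lemma 6.6. If $|z| \le 1$ and $|\zeta| \le c_{28}$, then

\[ H(z, \zeta) = \exp\bigl((z - 1 + e^{-z})\zeta + \tfrac12(z + e^{-z} - e^{-2z} - 2ze^{-z})\zeta^2 + O(|z^2\zeta^3|)\bigr). \]

Proof. Note first that $H(z, 0) = 1$. Hence, for $|z| \le 1$ and $|\zeta| \le c_{28}$, we
have $|H(z, \zeta) - 1| \le 1/2$. Thus $h(z, \zeta) := \log H(z, \zeta)$ is well defined in the
domain $D := \{(z, \zeta) : |z| \le 1,\ |\zeta| \le c_{28}\}$, with all its derivatives bounded and
$h(z, 0) = 0$. Moreover, $h(0, \zeta) = 0$ because $H(0, \zeta) = 1$. Also, by

\[ \frac{\partial}{\partial z} H(z, \zeta) = (e^{\zeta} - 1)e^{z(e^{\zeta} - 1) - \zeta} - e^{-z}(1 - e^{-\zeta}), \]

we have $\frac{\partial}{\partial z} h(0, \zeta) = \frac{\partial}{\partial z} H(0, \zeta) = 0$. Consequently,

\[ \frac{\partial^3}{\partial\zeta^3} h(0, \zeta) = 0 \quad\text{and}\quad \frac{\partial}{\partial z}\frac{\partial^3}{\partial\zeta^3} h(0, \zeta) = \frac{\partial^3}{\partial\zeta^3}\frac{\partial}{\partial z} h(0, \zeta) = 0, \]

and a Taylor expansion in $z$ yields, for $(z, \zeta) \in D$,

\[ \frac{\partial^3}{\partial\zeta^3} h(z, \zeta) = O(|z|^2). \]

Hence, by another Taylor expansion, now in $\zeta$, for $(z, \zeta) \in D$,

\[ h(z, \zeta) = \frac{\partial}{\partial\zeta} h(z, 0)\,\zeta + \frac12\frac{\partial^2}{\partial\zeta^2} h(z, 0)\,\zeta^2 + O(|z^2\zeta^3|), \]

and the result follows, with the values of $\frac{\partial}{\partial\zeta} h(z, 0)$ and
$\frac{\partial^2}{\partial\zeta^2} h(z, 0)$ obtained by straightforward calculus. □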

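Lemma 6.7. If $|\arg z| \le \pi/4$ and $|\zeta| \le c_{29}$, then

\[ H(z, \zeta) = \exp\bigl((z - 1 + e^{-z})\zeta + \tfrac12(z + e^{-z} - e^{-2z} - 2ze^{-z})\zeta^2 + O(|z\zeta^3|)\bigr). \]

Proof. By the definitions (6.4) and (5.11), with $w := ze^{\zeta}$,

\[ H(z, \zeta) = e^{-z + w}\bigl(e^{-\zeta} + e^{-w} - e^{-w - \zeta}\bigr) = e^{-z + w} G(w, -\zeta). \]

Thus, by Lemma 5.7, for $|\arg z| \le \pi/4$ and $|\zeta| \le c_{29}$,

\[ H(z, \zeta) = \exp\bigl(z(e^{\zeta} - 1) - (1 - e^{-w})\zeta + \tfrac12(e^{-w} - e^{-2w})\zeta^2 + O(|\zeta|^3)\bigr). \tag{6.5} \]

Moreover, for $|\arg z| \le \pi/4$ and $|\zeta| \le c_{29}$,

\[ \frac{\partial}{\partial\zeta} e^{-ze^{\zeta}} = -ze^{\zeta}e^{-ze^{\zeta}}, \qquad \frac{\partial^2}{\partial\zeta^2} e^{-ze^{\zeta}} = \bigl((ze^{\zeta})^2 - ze^{\zeta}\bigr)e^{-ze^{\zeta}} = O\bigl((|z|^2 + |z|)e^{-c_{30}|z|}\bigr) = O(1), \]

and thus

\[ e^{-w} = e^{-z} - ze^{-z}\zeta + O(|\zeta|^2). \]

The result follows by substituting this and $e^{\zeta} - 1 = \zeta + \tfrac12\zeta^2 + O(|\zeta|^3)$ in (6.5),
provided $|z| \ge 1$. The case $|z| < 1$ is a consequence of Lemma 6.6. □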
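A quick numerical look at the small-$z$ regime of Lemma 6.6 may help: the sketch below (illustrative only; the sample values of $z$ and $\zeta$ and the function names are assumptions) computes the remainder of $\log H(z, \zeta)$ after the linear and quadratic terms and divides it by $|z|^2|\zeta|^3$.

```python
import cmath

def H(z, zeta):
    """H(z, zeta) = e^{-z}(1 + exp(z e^{zeta} - zeta) - e^{-zeta}), as in (6.4)."""
    return cmath.exp(-z) * (1 + cmath.exp(z * cmath.exp(zeta) - zeta) - cmath.exp(-zeta))

def main_terms(z, zeta):
    """Linear and quadratic terms of log H from Lemmas 6.6 and 6.7."""
    e1 = cmath.exp(-z)
    return (z - 1 + e1) * zeta + 0.5 * (z + e1 - e1 * e1 - 2 * z * e1) * zeta ** 2

def scaled_remainder(z, zeta):
    """|log H - main terms| / (|z|^2 |zeta|^3)."""
    rem = abs(cmath.log(H(z, zeta)) - main_terms(z, zeta))
    return rem / (abs(z) ** 2 * abs(zeta) ** 3)

for z in (0.2, 0.1, 0.05):
    # should remain bounded as z -> 0, reflecting the factor |z|^2
    print(z, scaled_remainder(z, 0.05j))
```

The scaled remainder staying bounded as $z \to 0$ is the numerical counterpart of the $O(|z^2\zeta^3|)$ term; it is this factor $|z|^2$ that later produces $\tilde v(n)$ rather than $n$ in the error term of Proposition 6.9.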

Lemma 6.8. Uniformly for $\Re z \ge 1$ in the sector $|\arg z| \le \pi/4$,

\[ |\tilde v'(z)| = O\bigl(\tilde v(|z|)/|z|\bigr). \]

Proof. Let $\tau(z) := z + e^{-z}(1 - e^{-z} - 2z)$. Then $\tilde v(z) = \sum_j \tau(p_j z)$ and
$\tilde v'(z) = \sum_j p_j\tau'(p_j z)$. Since $\tau'(0) = 0$, we see that $\tau'(z) = O(|z|)$ for $|z| \le 1$.
Furthermore, it is easily seen that $\tau'(z) = O(1)$ when $|z| \ge 1$ in the sector
$|\arg z| \le \pi/3$. Hence, using (4.5),

\[ |\tilde v'(z)| \le C_8\sum_j p_j\bigl(p_j|z| \wedge 1\bigr) = C_8|z|^{-1}\sum_j (p_j|z|)^2 \wedge (p_j|z|) \le C_9|z|^{-1}\tilde v(|z|). \]

This completes the proof. □

The next result gives the analogue of Proposition 5.8. 

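Proposition 6.9. If $|\theta| \le \pi/4$ and $|\varphi| \le c_{31}$, then

\[ P(ne^{i\theta + i\varphi}, e^{-i\varphi}) = \exp\Bigl(ne^{i\theta} + (n - \mu(n)) i\varphi - (n - u(n))\varphi\theta - \tfrac12\tilde v(n)\varphi^2 + O\bigl(\tilde v(n)|\varphi|^3 + n\theta^2|\varphi|\bigr)\Bigr). \tag{6.6} \]

Proof. Let $z := ne^{i\theta}$. It follows from (6.3), Lemmas 6.6 and 6.7, and
(2.2)-(2.6) together with (4.5) that, assuming $|\varphi| \le c_{31}$,

\[ P(ze^{i\varphi}, e^{-i\varphi}) = \exp\bigl(z + i(z - \mu(z))\varphi - \tfrac12\tilde v(z)\varphi^2 + O(\tilde v(n)|\varphi|^3)\bigr). \]

By (5.15),

\[ z - \mu(z) = n - \mu(n) + i\theta(n - u(n)) + O(n\theta^2). \]

On the other hand, by Lemma 6.8, we also have

\[ \tilde v(z) = \tilde v(n) + O(|\theta|\tilde v(n)), \]

and the result (6.6) follows, in view of the inequalities
$\tilde v(n)|\theta|\varphi^2 \le \tilde v(n)|\varphi|^3 + \tilde v(n)\theta^2|\varphi|$ and $\tilde v(n) \le n$. □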
6.3. Proof of Theorem 2.4 when $\sum_{p_j n < 1} p_j \ge 1/2$. The analysis of (6.1)
is essentially the same as was done for (5.1) in Section 5.3, now using
Propositions 6.5 and 6.9 and the relation

\[ \tilde v(n) = (n - u(n))^2/n + \sigma^2(n), \]

which follows from (2.5) and (2.6). We omit the details.
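The relation between $\tilde v(n)$, $u(n)$ and $\sigma^2(n)$ used here is a purely algebraic identity (given $\sum_j p_j = 1$), so it can be confirmed directly. The sketch below is illustrative: it assumes the forms $\mu(x) = \sum_j (1 - e^{-p_j x})$, $v(x) = \sum_j e^{-p_j x}(1 - e^{-p_j x})$, $u(x) = x\sum_j p_j e^{-p_j x}$, $\sigma^2(x) = v(x) - u(x)^2/x$ and $\tilde v(x) = \sum_j \tau(p_j x)$ with $\tau$ as in the proof of Lemma 6.8; the Zipf-type distribution and the function names are hypothetical.

```python
import math

def zipf_probabilities(M):
    """Illustrative distribution p_j proportional to 1/j, j = 1..M."""
    w = [1.0 / j for j in range(1, M + 1)]
    s = sum(w)
    return [x / s for x in w]

def urn_functions(p, x):
    """Return mu, v, u, sigma^2 and v-tilde at the real point x."""
    e = [math.exp(-pj * x) for pj in p]
    mu = sum(1 - ej for ej in e)
    v = sum(ej * (1 - ej) for ej in e)
    u = x * sum(pj * ej for pj, ej in zip(p, e))
    sigma2 = v - u * u / x
    # tau(t) = t + e^{-t} - e^{-2t} - 2 t e^{-t}, summed over t = p_j x
    v_tilde = sum(pj * x + ej - ej * ej - 2 * pj * x * ej for pj, ej in zip(p, e))
    return {"mu": mu, "v": v, "u": u, "sigma2": sigma2, "v_tilde": v_tilde}

f = urn_functions(zipf_probabilities(300), 50.0)
print(f["v_tilde"], (50.0 - f["u"]) ** 2 / 50.0 + f["sigma2"])
```

The two printed numbers should agree up to rounding, and $0 \le \sigma^2 \le v$ should hold as well.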

7. Proofs of Theorem 2.1 and Corollary 2.5. 

Proof of Theorem 2.1. We may assume that $\sigma_F(n) \ge 1$ and $\operatorname{Var}(Z_{n,F}) \ge 1$,
since otherwise $\operatorname{Var}(Z_{n,F}) \le C_{10}$ by Theorem 2.3 and the result is trivial.
Then (2.1) is a simple consequence of Theorems 2.4 and 2.3. □

Proof of Corollary 2.5. (i) $\iff$ (ii). An immediate consequence of
Theorem 2.3.

(i) $\Rightarrow$ (iii). By Theorem 2.1.

(ii) $\Rightarrow$ (iv). By Theorem 2.4.

(iii) $\Rightarrow$ (v) and (iv) $\Rightarrow$ (v). Trivial.

(v) $\Rightarrow$ (i). (This part is standard and uses the fact that $Z_\nu$ assumes only
integer values.) If (v) holds, let $Z'_\nu$ be an independent copy of $Z_\nu$. Then

\[ (Z_\nu - Z'_\nu)/\beta_\nu \xrightarrow{d} N(0, 2). \tag{7.1} \]

If (i) fails, then there is a subsequence $(n_\nu, F_\nu)_{\nu \in N'}$, along which
$\operatorname{Var}(Z_\nu)$ is bounded; we consider that subsequence only, and let
$B := \sup_{\nu \in N'}\beta_\nu$.

If $B = \infty$, there is a subsubsequence along which $\beta_\nu \to \infty$, but this implies
$E((Z_\nu - Z'_\nu)/\beta_\nu)^2 = 2\operatorname{Var}(Z_\nu)/\beta_\nu^2 \to 0$, and thus
$(Z_\nu - Z'_\nu)/\beta_\nu \xrightarrow{p} 0$ along the subsubsequence, which contradicts (7.1).

On the other hand, if $B < \infty$, then $P((Z_\nu - Z'_\nu)/\beta_\nu \in [1/(4B), 1/(2B)]) = 0$
for all $\nu \in N'$ since $Z_\nu - Z'_\nu$ is integer-valued, which again contradicts (7.1).
□

8. Limit laws when the variance is small or bounded. We briefly consider
the possible limit laws for a sequence of random variables $Z_n = Z_{n,F_n}$ with
bounded variances. [Recall that Corollary 2.5 shows that $Z_n$ is asymptotically
normal in the opposite case when $\operatorname{Var}(Z_n) \to \infty$.] By Theorem 2.3, this
assumption is equivalent to $\sigma^2(n) = O(1)$, and according to Proposition 4.3
and Remark 4.1, we consider the following two cases:

(i) $\sum_{p_j n < 1} p_j \ge 1/2$: $\tilde v(n) = O(1)$, $v(n) \asymp n$;
(ii) $\sum_{p_j n \ge 1} p_j \ge 1/2$: $v(n) = O(1)$, $\tilde v(n) \asymp n$.

In both cases we can use the same Poissonization procedure as above, the
proofs being indeed much simpler. However, for more methodological interest,
we use the coupling argument mentioned in Remark 3.1. We say that an
event holds whp (with high probability), if it holds with probability tending
to 1 as $n \to \infty$.

Proposition 8.1. (i) If $n \to \infty$ with $\tilde v(n) = O(1)$, then whp $n - Z_n = \tilde Z(n)$.

(ii) If $n \to \infty$ with $v(n) = O(1)$, then whp $Z_n = Z(n)$.

Proof. Let $\lambda_\pm := n \pm n^{2/3}$. Then, whp, $N(\lambda_-) \le n \le N(\lambda_+)$, and thus
$Z(\lambda_-) \le Z_n \le Z(\lambda_+)$ and $\tilde Z(\lambda_-) \le n - Z_n \le \tilde Z(\lambda_+)$. Moreover,
$Z(\lambda_-) \le Z(n) \le Z(\lambda_+)$ and $\tilde Z(\lambda_-) \le \tilde Z(n) \le \tilde Z(\lambda_+)$. Consequently, it suffices
to show that whp $Z(\lambda_-) = Z(\lambda_+)$ in case (ii) and $\tilde Z(\lambda_-) = \tilde Z(\lambda_+)$ in case (i).

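In case (i) we have for $\lambda_- \le \lambda \le \lambda_+$, using (4.5),

\[ \frac{d}{d\lambda} E(\tilde Z(\lambda)) = \frac{d}{d\lambda}(\lambda - \mu(\lambda)) = 1 - \sum_j p_j e^{-p_j\lambda} = \sum_j p_j(1 - e^{-p_j\lambda}) \le \sum_j p_j(p_j\lambda \wedge 1) = O\Bigl(\sum_j (p_j^2 n \wedge p_j)\Bigr) = O(\tilde v(n)/n), \]

and thus

\[ P(\tilde Z(\lambda_+) \ne \tilde Z(\lambda_-)) \le E(\tilde Z(\lambda_+) - \tilde Z(\lambda_-)) = O\bigl((\lambda_+ - \lambda_-)\tilde v(n)/n\bigr) = O(n^{-1/3}\tilde v(n)) = o(1). \]

In case (ii) we have by Lemma 5.4, for $\lambda \ge 2$,

\[ \frac{d}{d\lambda} E(Z(\lambda)) = \sum_j p_j e^{-p_j\lambda} = \frac{u(\lambda)}{\lambda} = O\bigl(\lambda^{-1}\log(\lambda) v(\lambda)\bigr). \]

By Lemma 5.5 we thus have, for $\lambda \in [\lambda_-, \lambda_+]$ (and $n \ge 2$)

\[ \frac{d}{d\lambda} E(Z(\lambda)) = O\bigl(n^{-1}\log(n) v(n)\bigr) \]

and accordingly

\[ P(Z(\lambda_+) \ne Z(\lambda_-)) \le E(Z(\lambda_+) - Z(\lambda_-)) = O\bigl(n^{-1/3}\log(n) v(n)\bigr) = o(1). \]

□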

Limit results can now be obtained from the representations $Z(n) = \sum_j 1\{U_j \ge 1\}$
and $\tilde Z(n) = \sum_j \tilde U_j$ with independent summands given in Section 3.

We consider in detail two simple cases leading to Poisson limit laws. Both
cases are marked by the property that there are no $p_j$ of order $1/n$; compare
Chistyakov [6] and Kolchin, Sevast'yanov and Chistyakov [23], III.3.
Theorem 8.2. Suppose that $\tfrac12\sum_j (p_j n)^2 \to \Lambda < \infty$ and that $\max_j p_j n \to 0$;
then $n - Z_n \xrightarrow{d} \mathrm{Po}(\Lambda)$.

Proof. We have, by (4.5), $\tilde v(n) = O(\sum_j (p_j n)^2) = O(1)$, so
Proposition 8.1(i) applies. Further, we have by Section 3, with $U_j \sim \mathrm{Po}(p_j n)$
independent,

\[ \sum_j P(\tilde U_j \ge 2) = \sum_j P(U_j \ge 3) \le \sum_j (p_j n)^3 \le \max_j (p_j n)\sum_j (p_j n)^2 \to 0. \]

Thus, whp,

\[ n - Z_n = \tilde Z(n) = \sum_j \tilde U_j = \sum_j I_j, \]

where $I_j := 1\{\tilde U_j = 1\} = 1\{U_j = 2\} \sim \mathrm{Be}(\tfrac12 (p_j n)^2 e^{-p_j n})$ are independent
Bernoulli distributed variables. We have, as $n \to \infty$, $\max_j E(I_j) \to 0$ and

\[ \sum_j E(I_j) = \sum_j \tfrac12 (p_j n)^2 e^{-p_j n} = \sum_j \tfrac12 (p_j n)^2 + O\Bigl(\sum_j (p_j n)^3\Bigr) \to \Lambda. \]

Hence $\sum_j I_j \xrightarrow{d} \mathrm{Po}(\Lambda)$ by a standard result; see [14, 25] or, for example, [3],
Theorem 2.M. □
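Theorem 8.2 covers, for instance, the birthday-type regime of $n$ balls in $M$ equiprobable urns with $n^2/(2M) \to \Lambda$. The sketch below is an illustration under assumed parameters ($n = 200$, $\Lambda = 2$, hence $M = 10^4$) with hypothetical function names: it computes the exact law of $n - Z_n$ by recursion and its total variation distance to $\mathrm{Po}(\Lambda)$.

```python
import math

def collisions_pmf(n, M):
    """Exact pmf of n - Z_n for n balls in M equiprobable urns."""
    occ = [0.0] * (n + 1)      # occ[j] = P(j urns occupied so far)
    occ[0] = 1.0
    for _ in range(n):
        new = [0.0] * (n + 1)
        for j, pr in enumerate(occ):
            if pr == 0.0:
                continue
            new[j] += pr * j / M                 # repeat of an occupied urn
            if j < n and j < M:
                new[j + 1] += pr * (M - j) / M   # a new urn becomes occupied
        occ = new
    return [occ[n - k] for k in range(n + 1)]    # index k = n - Z_n

def poisson_pmf(lam, k):
    """Poisson probability computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def tv_to_poisson(n, lam):
    """Total variation distance between the law of n - Z_n and Po(lam)."""
    M = round(n * n / (2 * lam))   # makes (1/2) * sum_j (p_j n)^2 equal to lam
    pmf = collisions_pmf(n, M)
    po = [poisson_pmf(lam, k) for k in range(n + 1)]
    tail = max(0.0, 1.0 - sum(po))
    return 0.5 * (sum(abs(a - b) for a, b in zip(pmf, po)) + tail)

print(tv_to_poisson(200, 2.0))
```

Here $\max_j p_j n = n/M = 0.02$, so a small distance is expected; increasing $n$ with $\Lambda$ fixed should shrink it further.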

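Theorem 8.3. Suppose that:

(i) $\sum_{p_j n \le 1} p_j n \to \Lambda_1 \in [0, \infty)$,
(ii) $\sum_{p_j n > 1} e^{-p_j n} \to \Lambda_2 \in [0, \infty)$,
(iii) $\sup_j (p_j n \wedge (p_j n)^{-1}) \to 0$,

and let $m := \#\{j : p_j > 1/n\}$. Then $Z_n - m \xrightarrow{d} W_1 - W_2$, where $W_i \sim \mathrm{Po}(\Lambda_i)$
are independent.

Note that $m$ depends on $n$ and the $p_j$'s.

Proof. We have, by (4.4), $v(n) = O(1)$, so Proposition 8.1(ii) applies
and whp $Z_n = Z(n) = \sum_j U_j$, where, by Section 3, $U_j \sim \mathrm{Be}(1 - \exp(-p_j n))$
are independent. Hence, whp,

\[ Z_n - m = \sum_{p_j n \le 1} U_j - \sum_{p_j n > 1} (1 - U_j), \]

where the two sums of independent Bernoulli variables are independent. We
have, as $n \to \infty$,

\[ \sup_{p_j n \le 1} E(U_j) \le \sup_{p_j n \le 1} p_j n \le \sup_j \bigl(p_j n \wedge (p_j n)^{-1}\bigr) \to 0 \]

and

\[ \sum_{p_j n \le 1} E(U_j) = \sum_{p_j n \le 1} (1 - e^{-p_j n}) = \sum_{p_j n \le 1} p_j n + O\Bigl(\sum_{p_j n \le 1} (p_j n)^2\Bigr) \to \Lambda_1, \]

because

\[ \sum_{p_j n \le 1} (p_j n)^2 \le \sup_j \bigl(p_j n \wedge (p_j n)^{-1}\bigr)\sum_{p_j n \le 1} p_j n \to 0. \]

Hence $\sum_{p_j n \le 1} U_j \xrightarrow{d} W_1 \sim \mathrm{Po}(\Lambda_1)$, again by [14, 25] or, for example, [3],
Theorem 2.M.

Similarly, $\sum_{p_j n > 1} (1 - U_j) \xrightarrow{d} W_2 \sim \mathrm{Po}(\Lambda_2)$. □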

Remark 8.1 (Poisson approximation). We can derive more precise local
limit theorems by modifying our proof for Theorem 2.4; the proof is indeed
much simpler and omitted here.

Theorems 8.2 and 8.3 extend to the general case when some $p_j$ is of the
order $1/n$, but the limit distributions become more complicated. Consider
first case (i), with $\tilde v(n) = O(1)$. We may assume that, for each $n$,
$p_1 \ge p_2 \ge \cdots$; by (4.5), $p_1 n = O(1)$ and we may by taking a subsequence [of $(n, F)$]
assume that $p_j n \to q_j$ for every $j$ and some $q_j \in [0, \infty)$. (Thus, Theorem 8.2
is the case when all $q_j = 0$.) If we further assume, without loss of generality,
as in Theorem 8.2, that $\tfrac12\sum_j (p_j n)^2 \to \Lambda$ for some $\Lambda < \infty$, and let
$\Lambda' := \Lambda - \tfrac12\sum_j q_j^2$, it can be shown by arguments similar to those above that

\[ n - Z_n \xrightarrow{d} W + \sum_j \tilde V_j, \]

where $W \sim \mathrm{Po}(\Lambda')$, $\tilde V_j := V_j - 1\{V_j \ge 1\}$ with $V_j \sim \mathrm{Po}(q_j)$, and all terms are
independent. Note that the limit depends on the sequence $\{q_j\}$; thus, in
general, different subsequences may converge to different limits, even if the
limit $\Lambda$ exists.

Similarly, in case (ii), we may rearrange $(p_j)$ into two (finite or infinite)
sequences $(p'_j)$ and $(p''_j)$ with $1/n \ge p'_1 \ge p'_2 \ge \cdots$ and $1/n < p''_1 \le p''_2 \le \cdots$,
and by selecting a subsequence we may assume that $p'_j n \to q'_j$ and $p''_j n \to q''_j$
for some $q'_j \in [0, 1]$ and $q''_j \in [1, \infty]$. (If the sequences are finite, extend them
by 0's or $\infty$'s.) It can be shown that if $\Lambda_1$, $\Lambda_2$ and $m$ are as in Theorem 8.3,
then

\[ Z_n - m \xrightarrow{d} W' + \sum_j V'_j - W'' - \sum_j V''_j, \]

where $W' \sim \mathrm{Po}(\Lambda_1 - \sum_j q'_j)$, $W'' \sim \mathrm{Po}(\Lambda_2 - \sum_j e^{-q''_j})$, $V'_j \sim \mathrm{Be}(1 - e^{-q'_j})$,
$V''_j \sim \mathrm{Be}(e^{-q''_j})$, and all terms are independent. We leave the details to the
reader.
9. Fixed distribution. We briefly discuss a few characteristic properties
for the case when the distribution $F$ is kept fixed while $n \to \infty$. We may as
in Remark 3.1 assume that the sequence $(Z_n)$ is obtained by throwing balls
one after another; thus $Z_1 \le Z_2 \le \cdots$.

Let $M := \#\{j : p_j > 0\}$, the number of distinct values that $X_i$ can take
with positive probability. If $M$ is finite, then a.s. all these values are sooner
or later assumed by some $X_i$, and thus $Z_n = M$ for large enough $n$. In other
words, then $Z_n = M$ whp, and $Z_n \to M$ as $n \to \infty$.

We will therefore in this section assume that $M = \infty$. It is then easily
seen that $Z_n \to \infty$ a.s. as $n \to \infty$; similarly $Z(\lambda) \to \infty$ a.s. as $\lambda \to \infty$.
Consequently, $E(Z_n) \to \infty$ as $n \to \infty$ and $\mu(\lambda) = E(Z(\lambda)) \to \infty$ as $\lambda \to \infty$.

On the other hand, by (2.2) and the dominated convergence theorem

\[ \mu(x)/x = \sum_j (1 - e^{-p_j x})/x \to 0 \quad\text{as } x \to \infty, \]

since $0 \le (1 - e^{-p_j x})/x \le p_j$ and $\sum_j p_j < \infty$; see also Karlin [20] for an
alternative proof. In other words, $\mu(x) = o(x)$ as $x \to \infty$, and thus, by
Theorem 2.3, $E(Z_n) = o(n)$ as $n \to \infty$.
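The statement $\mu(x) = o(x)$ is easy to observe on a concrete fixed distribution. The sketch below assumes, purely for illustration, the geometric probabilities $p_j = 2^{-j}$ ($j \ge 1$, truncated after 60 terms, which discards a negligible mass) and prints $\mu(x)/x$ for increasing $x$; the function names are hypothetical.

```python
import math

def mu(x, terms=60):
    """mu(x) = sum_j (1 - exp(-p_j x)) for p_j = 2^{-j}, j = 1..terms."""
    # -expm1(-t) evaluates 1 - exp(-t) accurately for small t
    return sum(-math.expm1(-x * 2.0 ** -j) for j in range(1, terms + 1))

def ratios(xs):
    """Values of mu(x)/x along the given points."""
    return [mu(x) / x for x in xs]

xs = [10, 100, 1000, 10000]
for x, r in zip(xs, ratios(xs)):
    print(x, mu(x), r)
```

For this distribution $\mu(x)$ grows only logarithmically, so the ratio decreases quickly; it is monotone in general because $\mu$ is concave with $\mu(0) = 0$.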

Similarly, the $O(1)$ terms in Theorem 2.3 can be improved to $o(1)$; these
remainder terms are given in our proof in Section 4 as sums, where each
term tends to 0 and domination is provided by the estimates given in our
proof.

Finally, $\sum_{p_j x \ge 1} p_j \to \sum_{p_j > 0} p_j = 1$ as $x \to \infty$; thus we always have
$\sum_{p_j x \ge 1} p_j \ge 1/2$ for large $x$. Hence $\sigma^2(n) \asymp v(n)$ and, for limit results, we
only have to consider the case in Section 5.